Software

Introduction

This page collects software produced during my research. I work mainly in two languages: Matlab for the initial development and research stages, and C++, to which I then port my results with the help of an auxiliary library I created for that purpose. For this reason, some of the software on this page is released in Matlab and the rest in C++.

License

All the software I release is provided as free-to-use, open-source code. I believe in the principle of reproducible research, which requires that the results reported in a paper can be reproduced by any other researcher. This is why I produce open-source releases of my methods for other researchers to use. The main purpose of these releases is to provide the research community with standard implementations of the methods, allowing fair and direct comparisons.
I also want to emphasize that all the software available on this page is provided as is, without ANY kind of warranty, and the author does not take ANY responsibility for the results of its use.
If you like this software, I will consider it payment enough if you let me know your opinion and tell me in which applications it was used successfully.
I will also appreciate bug reports. I am fairly sure the software contains many bugs, because so far I have been its only user and have used it only in very restricted applications and examples, so please use it with care.

Downloads


SPeech Processing library (SPP)

This library is written in C++, although it is essentially C with some C++ flavour (for example, operator and function overloading). I made almost no use of C++ classes and templates, for speed, compatibility and portability reasons. The main purpose of the library is to provide a generic set of functions for manipulating speech signals. Since signals are usually represented as vectors, and the processing applied to them can be expressed as matrix-vector operations, the library is built around a set of matrix and vector types, with both real and complex values, together with basic operations and functions to modify and operate on them.
On top of these basic building blocks, many high-level processing functions, such as filtering, transforms and analysis, are built. The library is fully documented using Doxygen and provides more than 700 functions.
I have compiled it successfully on several computers running Linux (Fedora 6, 7, 8, 9; Ubuntu 6, 7, 8), on both 32-bit and 64-bit architectures, and also under Windows XP using MinGW and MSYS, MS Visual C++ 5, MS Visual C++ 2005 Express Edition and Borland C++ Compiler 5.5 (the last three required some minor changes and option settings).
Most of the library was written by me, Leandro Di Persia. In some specific functions I have used code from other people, usually adapting it to the data types of this library and correcting minor bugs, and sometimes porting it from its original language. In those cases I have left an acknowledgement to the authors in the corresponding files. The file AUTHORS also describes the authorship of all the code.

Update: Version 2 is now available. It adds new functionality, fixes many bugs, improves speed, and includes more examples and test programs.

Requirements

There are no specific requirements. The library does not use any particularly advanced language features or any external library. It is completely self-contained.

Installation

The usual Linux build structure based on a Makefile is provided. For other platforms (MS Visual C++ under Windows, for example), a project has to be created.
The file INSTALL has instructions for building and installing the library. Note that only a static version of the library is provided.

Download

Version 1 source: tar.gz
Version 2 source: tar.gz

Back to top


DWT Library

This is a standalone library for calculating the dyadic orthogonal wavelet transform of signals. It includes two families of wavelet filters: Daubechies and Symlets.
It uses the fast pyramid algorithm for a very fast transform. The filtering is performed by circular convolutions to speed up the calculations, using periodic extension of the signal.
There are two versions of the code: C++ and Matlab. Both versions were tested and produce the same results as Matlab's Wavelet Toolbox when using periodization. The Matlab version will be faster, however, because the convolution and downsampling (or upsampling and convolution in the reconstruction) are performed simultaneously to reduce the number of operations.
The C++ version uses the standard vector<double> type for its data, so it can be easily incorporated into any C++ code.
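To make the idea concrete, the following is a minimal sketch of a single level of a periodized orthogonal DWT and its inverse on vector<double> data. It is not the library's code: the names DwtLevel, db2_lowpass, highpass_from, dwt_step and idwt_step are hypothetical, only the 4-tap Daubechies filter is shown, and the coefficient alignment is not claimed to match the library or Matlab's periodization mode.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Approximation and detail coefficients of one decomposition level.
struct DwtLevel {
    std::vector<double> approx;
    std::vector<double> detail;
};

// Daubechies low-pass filter with 2 vanishing moments (4 taps).
std::vector<double> db2_lowpass() {
    const double s3 = std::sqrt(3.0);
    const double norm = 4.0 * std::sqrt(2.0);
    return {(1.0 + s3) / norm, (3.0 + s3) / norm,
            (3.0 - s3) / norm, (1.0 - s3) / norm};
}

// Quadrature mirror high-pass filter: g[m] = (-1)^m * h[L-1-m].
std::vector<double> highpass_from(const std::vector<double>& h) {
    const std::size_t L = h.size();
    std::vector<double> g(L);
    for (std::size_t m = 0; m < L; ++m) {
        g[m] = ((m % 2 == 0) ? 1.0 : -1.0) * h[L - 1 - m];
    }
    return g;
}

// One analysis step: circular filtering computed only at even shifts,
// so filtering and downsampling by 2 happen in the same loop.
DwtLevel dwt_step(const std::vector<double>& x, const std::vector<double>& h) {
    const std::size_t n = x.size();
    assert(n >= 2 && n % 2 == 0);  // downsampling by 2 needs an even length
    const std::vector<double> g = highpass_from(h);
    DwtLevel out{std::vector<double>(n / 2, 0.0), std::vector<double>(n / 2, 0.0)};
    for (std::size_t k = 0; k < n / 2; ++k) {
        for (std::size_t m = 0; m < h.size(); ++m) {
            const double v = x[(2 * k + m) % n];  // periodic extension
            out.approx[k] += h[m] * v;
            out.detail[k] += g[m] * v;
        }
    }
    return out;
}

// One synthesis step: the transpose of dwt_step (upsampling and filtering).
std::vector<double> idwt_step(const DwtLevel& c, const std::vector<double>& h) {
    assert(c.approx.size() == c.detail.size());
    const std::size_t n = 2 * c.approx.size();
    const std::vector<double> g = highpass_from(h);
    std::vector<double> x(n, 0.0);
    for (std::size_t k = 0; k < n / 2; ++k) {
        for (std::size_t m = 0; m < h.size(); ++m) {
            x[(2 * k + m) % n] += h[m] * c.approx[k] + g[m] * c.detail[k];
        }
    }
    return x;
}
```

A full pyramid decomposition would apply dwt_step repeatedly to the approximation coefficients of the previous level; because the periodized transform is orthogonal, idwt_step recovers the input up to rounding error.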

Requirements

The library has no particular requirements; it only needs the standard C++ library to compile.

Installation

The file INSTALL contains detailed installation instructions.

Download

Source: tar.gz

Back to top


Trim

This is a small program that I sometimes use to cut the ends of a speech signal. When a recording is made, a rather long silence sometimes appears at both ends of the sentence. This can be a drawback in some research areas. For example, if one wants to artificially contaminate the signal with noise at a certain SNR, the noise will be active all the time, but the speech will be active during a shorter period. This yields an SNR different from the desired one (with more power during the speech part).
Note that this is not a Voice Activity Detector (VAD), and it cannot detect pauses inside the sentence: it scans the signal from the beginning and from the end, searching for enough energy, so it can only set two cut points.
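The scanning idea can be sketched as follows. This is only an illustration under assumptions, not the program's actual code: the names TrimPoints and find_trim_points, the fixed frame grid and the mean-energy threshold are all hypothetical choices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open sample range [begin, end) to keep after trimming.
struct TrimPoints {
    std::size_t begin;
    std::size_t end;
};

// Scan frames from the start and from the end until one has a mean
// energy of at least `threshold`. Pauses between the two cut points
// are deliberately left untouched.
TrimPoints find_trim_points(const std::vector<double>& x,
                            std::size_t frame, double threshold) {
    assert(frame > 0);
    const std::size_t n = x.size();
    const std::size_t frames = (n + frame - 1) / frame;  // last frame may be short

    auto mean_energy = [&](std::size_t f) {
        const std::size_t start = f * frame;
        const std::size_t stop = std::min(start + frame, n);
        double e = 0.0;
        for (std::size_t i = start; i < stop; ++i) e += x[i] * x[i];
        return e / static_cast<double>(stop - start);
    };

    // Forward scan: first frame with enough energy.
    std::size_t first = 0;
    while (first < frames && mean_energy(first) < threshold) ++first;
    if (first == frames) return {0, 0};  // no frame reaches the threshold

    // Backward scan: last frame with enough energy.
    std::size_t last = frames - 1;
    while (last > first && mean_energy(last) < threshold) --last;

    return {first * frame, std::min((last + 1) * frame, n)};
}
```

The trimmed signal is then the samples in [begin, end). Measuring the speech power on this range only, before scaling the noise, avoids the SNR mismatch described above.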

Requirements

This program uses functions from the SPP library (see above). You need to have SPP installed on your system for this program to compile.

Installation

The file INSTALL contains detailed installation instructions.

Download

Source: tar.gz

Back to top


Mixer

This program produces synthetic convolutive mixtures of two sources as measured by two microphones. For this, it needs the impulse responses (IRs) measured from each source to each microphone. The program lets you specify a desired power ratio. This ratio is adjusted at the sources (that is, it simulates the output power relation of the sources). The final SNR of the mixtures will depend on the characteristics of the IRs.
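As a rough illustration of the mixing model, each microphone signal is the sum of the two sources convolved with the corresponding IRs, after the second source has been rescaled to reach the requested power ratio. The sketch below is an assumption about the general procedure, not the program's code; the names Signal, convolve, mean_power, scale_to_ratio, Mixtures and mix_2x2 are hypothetical, and the ratio is assumed to be given in dB as the power of source 1 over source 2.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Signal = std::vector<double>;

// Full linear convolution of a and b (length a.size() + b.size() - 1).
Signal convolve(const Signal& a, const Signal& b) {
    assert(!a.empty() && !b.empty());
    Signal y(a.size() + b.size() - 1, 0.0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = 0; j < b.size(); ++j) y[i + j] += a[i] * b[j];
    }
    return y;
}

// Mean power (average of squared samples).
double mean_power(const Signal& s) {
    assert(!s.empty());
    double p = 0.0;
    for (double v : s) p += v * v;
    return p / static_cast<double>(s.size());
}

// Rescale s2 so that 10*log10(P(s1) / P(s2)) equals ratio_db.
Signal scale_to_ratio(const Signal& s1, Signal s2, double ratio_db) {
    const double p1 = mean_power(s1);
    const double p2 = mean_power(s2);
    assert(p2 > 0.0);
    const double gain = std::sqrt(p1 / (p2 * std::pow(10.0, ratio_db / 10.0)));
    for (double& v : s2) v *= gain;
    return s2;
}

struct Mixtures {
    Signal mic1;
    Signal mic2;
};

// hji is the impulse response from source i to microphone j.
Mixtures mix_2x2(const Signal& s1, const Signal& s2,
                 const Signal& h11, const Signal& h12,
                 const Signal& h21, const Signal& h22, double ratio_db) {
    assert(s1.size() == s2.size());
    assert(h11.size() == h12.size() && h11.size() == h21.size() &&
           h11.size() == h22.size());
    const Signal s2_scaled = scale_to_ratio(s1, s2, ratio_db);
    Mixtures out{convolve(h11, s1), convolve(h21, s1)};
    const Signal c12 = convolve(h12, s2_scaled);
    const Signal c22 = convolve(h22, s2_scaled);
    for (std::size_t n = 0; n < out.mic1.size(); ++n) {
        out.mic1[n] += c12[n];
        out.mic2[n] += c22[n];
    }
    return out;
}
```

Because the ratio is imposed before the convolutions, the power ratio actually observed at each microphone depends on the gains of the four IRs, which is why the final SNR of the mixtures is not fixed by the requested value alone.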

Requirements

This program uses functions from the SPP library (see above). You need to have SPP installed on your system for this program to compile.

Installation

The file INSTALL contains detailed installation instructions.

Download

Source: tar.gz

Back to top


RevTime

This program measures the acoustical properties of a room from an impulse response measured in it. The measures are the reverberation times $T_{20}$ and $T_{30}$, the Early Decay Time (EDT), the clarity indexes $C_{50}$ and $C_{80}$, and the definition index $D_{80}$. All these measures are calculated using the global IR, without filtering. In addition, $T_{20}$, $T_{30}$ and EDT are measured in octave bands according to the ISO standard. The reverberation times are measured using Schroeder backward integration to smooth the energy decay curve, and the noise threshold is estimated from the last 20% of the samples of the IR.
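The core of these measurements can be sketched as follows. This is a simplified illustration, not the program's code: the function names schroeder_edc_db, reverberation_time and clarity_db are hypothetical, the noise-threshold estimation and octave-band filtering are omitted, and the reverberation time is assumed to follow the usual convention of fitting a line to the decay curve between -5 dB and -(5 + range) dB and extrapolating to a 60 dB decay.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Schroeder backward integration: energy remaining after each sample,
// normalised to the total energy and expressed in dB.
// Trailing zero samples would give -inf here; they fall outside every
// fitting range used below.
std::vector<double> schroeder_edc_db(const std::vector<double>& h) {
    assert(!h.empty());
    std::vector<double> edc(h.size());
    double acc = 0.0;
    for (std::size_t i = h.size(); i-- > 0;) {
        acc += h[i] * h[i];
        edc[i] = acc;
    }
    const double total = edc[0];
    assert(total > 0.0);
    for (double& v : edc) v = 10.0 * std::log10(v / total);
    return edc;
}

// Least-squares line through the decay curve between -5 dB and
// -(5 + range_db) dB, extrapolated to 60 dB of decay.
// range_db = 20 gives T20, range_db = 30 gives T30.
double reverberation_time(const std::vector<double>& edc_db, double fs,
                          double range_db) {
    const double hi = -5.0;
    const double lo = -5.0 - range_db;
    double st = 0.0, sy = 0.0, stt = 0.0, sty = 0.0;
    std::size_t n = 0;
    for (std::size_t i = 0; i < edc_db.size(); ++i) {
        const double y = edc_db[i];
        if (y <= hi && y >= lo) {
            const double t = static_cast<double>(i) / fs;
            st += t; sy += y; stt += t * t; sty += t * y;
            ++n;
        }
    }
    assert(n >= 2);
    const double slope = (n * sty - st * sy) / (n * stt - st * st);  // dB/s
    return -60.0 / slope;
}

// Clarity index: early-to-late energy ratio in dB, split at t_split
// seconds (0.050 for C50, 0.080 for C80).
double clarity_db(const std::vector<double>& h, double fs, double t_split) {
    const std::size_t split = static_cast<std::size_t>(std::llround(t_split * fs));
    assert(split > 0 && split < h.size());
    double early = 0.0, late = 0.0;
    for (std::size_t i = 0; i < h.size(); ++i) {
        (i < split ? early : late) += h[i] * h[i];
    }
    assert(late > 0.0);
    return 10.0 * std::log10(early / late);
}
```

On a measured IR the tail of the decay curve is dominated by noise, which is why a noise threshold has to be estimated and the integration limited accordingly; the sketch assumes a noise-free response.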

Requirements

This program uses functions from the SPP library (see above). You need to have SPP installed on your system for this program to compile.
The program can also generate figures of the energy decay curves in .eps and .pdf formats. The Dislin library is used for this (see http://www.mps.mpg.de/dislin/). If you do not have this library installed, or you do not need the graphics, you need to edit the Makefile accordingly.

Installation

The file INSTALL contains detailed installation instructions.

Download

Source: tar.gz

Back to top


InverseIR

This program produces an optimal FIR estimate, in the LMS sense, of the inverse of a room impulse response (IR). Real measured impulse responses are usually not minimum-phase, which means that they do not have an exact stable, causal inverse. This program solves the problem by producing a FIR estimate of the inverse that is optimal in the LMS sense. Given the input impulse response in wav format and the desired length of the inverse, the program estimates the inverse and also applies it to the original IR to obtain the global result. Both the inverse and the global result are saved in wav format. The program also needs the desired delay in order to estimate the inverse properly; different delays produce different estimates of the inverse.
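One common way to pose this kind of problem is the least-squares formulation below. It is given only as an assumption about the general approach, not as a description of the program's internals, and the symbols $h$, $g$, $N$, $M$, $\Delta$, $\mathbf{H}$ and $\mathbf{d}_{\Delta}$ are notation introduced for this sketch. Let $h[n]$ be the IR of length $N$, $g[n]$ the FIR inverse of length $M$, and $\Delta$ the desired delay in samples. With $\mathbf{H}$ the $(N+M-1) \times M$ convolution matrix of $h$, that is $H_{ij} = h[i-j]$ for $0 \le i-j \le N-1$ and zero otherwise, and $\mathbf{d}_{\Delta}$ the vector with a one at position $\Delta$ and zeros elsewhere:

$$
\mathbf{g}^{\star} = \arg\min_{\mathbf{g} \in \mathbb{R}^{M}} \left\| \mathbf{H}\mathbf{g} - \mathbf{d}_{\Delta} \right\|_2^2 = \left( \mathbf{H}^{T}\mathbf{H} \right)^{-1} \mathbf{H}^{T} \mathbf{d}_{\Delta}
$$

The global result is then $(h * g^{\star})[n] \approx \delta[n - \Delta]$. For a nonzero IR, $\mathbf{H}$ has full column rank, so the solution is unique. The delay matters because the exact inverse of a non-minimum-phase response has an anticausal part, and a causal FIR filter can only approximate that part if the target impulse is shifted by $\Delta$.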

Requirements

This program uses functions from the SPP library (see above). You need to have SPP installed on your system for this program to compile.

Installation

The file INSTALL contains detailed installation instructions.

Download

Source: tar.gz

Back to top


FDBSS

This program is a demo for our frequency-domain Blind Source Separation method using the pseudoanechoic model, as presented in:
- L. Di Persia, D. Milone and M. Yanagida, "Indeterminacy free frequency-domain blind separation of reverberant audio sources", IEEE Transactions on Audio, Speech and Language Processing, Vol. 17, No. 2, pp. 299-311, 2009.

It is designed to work on a 2-by-2 convolutive mixture. It uses a robust frequency-domain ICA approach to estimate the mixture parameters and then synthesizes the separation matrices for each frequency bin using this information. It also applies a Wiener-like postfilter to improve the separation by reducing the effect of echoes.
To test the method, some examples of real mixtures are given, with a small script to run the program on them.

Requirements

This program uses functions from the SPP library (see above). You need to have SPP installed on your system for this program to compile.

Installation

The file INSTALL contains detailed installation instructions.

Download

Source: tar.gz

Back to top



Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-Share Alike 2.5 License.