pysptk

A Python wrapper for the Speech Signal Processing Toolkit (SPTK).

https://github.com/r9y9/pysptk

The wrapper is based on a modified version of SPTK (r9y9/SPTK)

Full documentation

Full documentation of SPTK is available at http://sp-tk.sourceforge.net. If you are not familiar with SPTK, I recommend reading that documentation before using pysptk.

Demonstration notebooks

Installation guide

The latest release is available on PyPI. Assuming you already have numpy installed, you can install pysptk with:

pip install pysptk

If you want the latest development version, run:

pip install git+https://github.com/r9y9/pysptk

or:

git clone https://github.com/r9y9/pysptk
cd pysptk
python setup.py develop # or install

This should resolve the package dependencies and install pysptk properly.

Note

If you use the development version, you need to have cython (and a C compiler) installed to compile the cython module(s).

For Windows users

Some binary wheels are available on PyPI, so you can install pysptk via pip without cython and a C compiler if a binary wheel exists that matches your environment (this depends on whether the system is 32-bit or 64-bit and on the Python version). For now, wheels are available for:

  • Python 2.7 on 32 bit system
  • Python 2.7 on 64 bit system
  • Python 3.4 on 32 bit system

If there is no binary wheel available for your environment, you can build pysptk from the source distribution, which is also available on PyPI. Note that to compile pysptk from source on Windows, it is highly recommended to use Anaconda, since it makes installing numpy, cython and other scientific packages easy. In fact, the continuous integration on Windows (AppVeyor) uses Anaconda to build and test pysptk. See pysptk/appveyor.yml for the exact build steps.

API documentation

API

Core SPTK API

All functionality in pysptk.sptk (the core API) is directly accessible from the top-level pysptk.* namespace.

For convenience, vector-to-vector functions (pysptk.mcep, pysptk.mc2b, etc.) that take an input vector as the first argument can also accept a matrix. For matrix inputs, these functions are applied along the last axis internally; e.g.

mc = pysptk.mcep(frames) # frames.shape == (num_frames, frame_len)

is equivalent to:

mc = np.apply_along_axis(pysptk.mcep, -1, frames)

Warning

The core APIs in the pysptk.sptk package are based on SPTK’s internal APIs (e.g. the code in _mgc2sp.c), so their functionality is not exactly the same as that of SPTK’s CLI. If you find any inconsistency that should be addressed, please file an issue.

Note

Almost all pysptk functions assume that the input array is C-contiguous and has the float64 element type. For vector-to-vector functions, the input array is automatically converted to float64, the function is executed on it, and the output array is then converted back to the type of the input you provided.

Library routines
agexp(r, x, y) Magnitude squared generalized exponential function
gexp(r, x) Generalized exponential function
glog(r, x) Generalized logarithmic function
mseq() M-sequence
Adaptive cepstrum analysis
acep(x, c[, lambda_coef, step, tau, pd, eps]) Adaptive cepstral analysis
agcep(x, c[, stage, lambda_coef, step, tau, eps]) Adaptive generalized cepstral analysis
amcep(x, b[, alpha, lambda_coef, step, tau, …]) Adaptive mel-cepstral analysis
Mel-generalized cepstrum analysis
mcep(windowed[, order, alpha, miniter, …]) Mel-cepstrum analysis
gcep(windowed[, order, gamma, miniter, …]) Generalized-cepstrum analysis
mgcep(windowed[, order, alpha, gamma, …]) Mel-generalized cepstrum analysis
uels(windowed[, order, miniter, maxiter, …]) Unbiased estimation of log spectrum
fftcep(logsp[, order, num_iter, …]) FFT-based cepstrum analysis
lpc(windowed[, order, min_det]) Linear prediction analysis
MFCC
mfcc(x[, order, fs, alpha, eps, window_len, …]) MFCC
LPC, LSP and PARCOR conversions
lpc2c(lpc[, order]) LPC to cepstrum
lpc2lsp(lpc[, numsp, maxiter, eps, loggain, …]) LPC to LSP
lpc2par(lpc) LPC to PARCOR
par2lpc(par) PARCOR to LPC
lsp2sp(lsp[, fftlen]) LSP to spectrum
Mel-generalized cepstrum conversions
mc2b(mc[, alpha]) Mel-cepstrum to MLSA filter coefficients
b2mc(b[, alpha]) MLSA filter coefficients to mel-cepstrum
c2acr(c[, order, fftlen]) Cepstrum to autocorrelation
c2ir(c[, length]) Cepstrum to impulse response
ic2ir(h[, order]) Impulse response to cepstrum
c2ndps(c[, fftlen]) Cepstrum to Negative Derivative of Phase Spectrum (NDPS)
ndps2c(ndps[, order]) Negative Derivative of Phase Spectrum (NDPS) to cepstrum
gc2gc(src_ceps[, src_gamma, dst_order, …]) Generalized cepstrum transform
gnorm(ceps[, gamma]) Gain normalization
ignorm(ceps[, gamma]) Inverse gain normalization
freqt(ceps[, order, alpha]) Frequency transform
mgc2mgc(src_ceps[, src_alpha, src_gamma, …]) Mel-generalized cepstrum transform
mgc2sp(ceps[, alpha, gamma, fftlen]) Mel-generalized cepstrum to spectrum
mgclsp2sp(lsp[, alpha, gamma, fftlen, gain]) MGC-LSP to spectrum
F0 analysis
swipe(x, fs, hopsize[, min, max, threshold, …]) SWIPE’ - A Saw-tooth Waveform Inspired Pitch Estimation
rapt(x, fs, hopsize[, min, max, voice_bias, …]) RAPT - a robust algorithm for pitch tracking
Excitation generation
excite(pitch[, hopsize, interp_period, …]) Excitation generation
Window functions
blackman(n[, normalize]) Blackman window
hamming(n[, normalize]) Hamming window
hanning(n[, normalize]) Hanning window
bartlett(n[, normalize]) Bartlett window
trapezoid(n[, normalize]) Trapezoid window
rectangular(n[, normalize]) Rectangular window
Waveform generation filters
poledf(x, a, delay) All-pole digital filter
lmadf(x, b, pd, delay) LMA digital filter
lspdf(x, f, delay) LSP synthesis digital filter
ltcdf(x, k, delay) All-pole lattice digital filter
glsadf(x, c, stage, delay) GLSA digital filter
mlsadf(x, b, alpha, pd, delay) MLSA digital filter
mglsadf(x, b, alpha, stage, delay) MGLSA digital filter
Utilities for waveform generation filters
poledf_delay(order) Delay for poledf
lmadf_delay(order, pd) Delay for lmadf
lspdf_delay(order) Delay for lspdf
ltcdf_delay(order) Delay for ltcdf
glsadf_delay(order, stage) Delay for glsadf
mlsadf_delay(order, pd) Delay for mlsadf
mglsadf_delay(order, stage) Delay for mglsadf
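
For orientation, the generalized logarithm and generalized exponential listed under Library routines are commonly defined as shown below. This is a dependency-free sketch of the textbook definitions, not pysptk's code: the names glog_sketch and gexp_sketch are made up for this illustration, and SPTK's handling of edge cases (for example a negative base) may differ.

```python
import math


def glog_sketch(r, x):
    """Generalized logarithm: (x**r - 1) / r, reducing to log(x) when r == 0."""
    if r == 0.0:
        return math.log(x)
    return (x ** r - 1.0) / r


def gexp_sketch(r, x):
    """Generalized exponential: (1 + r*x)**(1/r), reducing to exp(x) when r == 0."""
    if r == 0.0:
        return math.exp(x)
    return (1.0 + r * x) ** (1.0 / r)


# The two functions invert each other wherever 1 + r*x > 0.
for r in (0.0, -0.5, 1.0):
    assert abs(glog_sketch(r, gexp_sketch(r, 0.3)) - 0.3) < 1e-12
```

With r = 0 the pair reduces to the ordinary log and exp, and with r = 1 the generalized logarithm is simply x - 1.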

Other conversions

These conversions do not exist in SPTK itself, but can be used together with the core API. Functions in the pysptk.conversion module are also directly accessible as pysptk.*.

mgc2b(mgc[, alpha, gamma]) Mel-generalized cepstrum to MGLSA filter coefficients
sp2mc(powerspec, order, alpha) Convert spectrum envelope to mel-cepstrum
mc2sp(mc, alpha, fftlen) Convert mel-cepstrum back to power spectrum
mc2e(mc[, alpha, irlen]) Compute energy from mel-cepstrum

High-level interface for waveform synthesis

The pysptk.synthesis module provides a high-level interface that wraps the low-level SPTK waveform synthesis functions (e.g. mlsadf).

Synthesizer
class pysptk.synthesis.Synthesizer(filt, hopsize)

Speech waveform synthesizer

Attributes:
filt : SynthesisFilter

A speech synthesis filter

hopsize : int

Hop size

synthesis(source, b)

Synthesize a waveform given a source excitation and sequence of filter coefficients (e.g. cepstrum).

Parameters:
source : array

Source excitation

b : array

Filter coefficients

Returns:
y : array, shape (same as source)

Synthesized waveform

synthesis_one_frame(source, prev_b, curr_b)

Synthesize one frame waveform

Parameters:
source : array

Source excitation

prev_b : array

Filter coefficients of previous frame

curr_b : array

Filter coefficients of current frame

Returns:
y : array

Synthesized waveform
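
To picture how synthesis and synthesis_one_frame relate, here is a dependency-free sketch of a frame-by-frame loop. It is not pysptk's implementation: the linear interpolation of coefficients between frames is an assumption (the text above does not state how prev_b and curr_b are combined), and GainOnlyFilter, synthesize_one_frame_sketch and synthesize_sketch are hypothetical names.

```python
import math


class GainOnlyFilter:
    """Hypothetical stand-in filter: scales the sample by exp(coef[0])."""

    def filt(self, x, coef):
        return x * math.exp(coef[0])


def synthesize_one_frame_sketch(filt, source, prev_b, curr_b):
    """Filter one frame while moving the coefficients from prev_b toward curr_b."""
    n = len(source)
    if n == 0:
        return []
    # Assumed: per-sample linear interpolation between the two frames.
    slope = [(c - p) / n for p, c in zip(prev_b, curr_b)]
    coef = list(prev_b)
    out = []
    for sample in source:
        out.append(filt.filt(sample, coef))
        coef = [c + s for c, s in zip(coef, slope)]
    return out


def synthesize_sketch(filt, source, b, hopsize):
    """Run the frame loop over a sequence of per-frame coefficient vectors."""
    out = []
    prev_b = b[0]
    for i, curr_b in enumerate(b):
        frame = source[i * hopsize:(i + 1) * hopsize]
        out.extend(synthesize_one_frame_sketch(filt, frame, prev_b, curr_b))
        prev_b = curr_b
    return out


source = [1.0] * 8
b = [[0.0], [math.log(2.0)]]  # two frames, one coefficient each
y = synthesize_sketch(GainOnlyFilter(), source, b, hopsize=4)
```

The output has the same length as the source, and the gain ramps smoothly inside the second frame instead of jumping at the frame boundary.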

SynthesisFilters
LMADF
class pysptk.synthesis.LMADF(order=25, pd=4)

LMA digital filter that wraps lmadf

Attributes:
pd : int

Order of Padé approximation. Default is 4.

delay : array

Delay

filt(x, coef)

Filter one sample using lmadf

Parameters:
x : float

An input sample

coef: array

LMA filter coefficients (i.e. Cepstrum)

Returns:
y : float

A filtered sample

MLSADF
class pysptk.synthesis.MLSADF(order=25, alpha=0.35, pd=4)

MLSA digital filter that wraps mlsadf

Attributes:
alpha : float

All-pass constant

pd : int

Order of Padé approximation. Default is 4.

delay : array

Delay

filt(x, coef)

Filter one sample using mlsadf

Parameters:
x : float

An input sample

coef: array

MLSA filter coefficients

Returns:
y : float

A filtered sample

MGLSADF
class pysptk.synthesis.MGLSADF(order=25, alpha=0.35, stage=1)

MGLSA digital filter that wraps mglsadf

Attributes:
alpha : float

All-pass constant

stage : int

-1/gamma

delay : array

Delay

filt(x, coef)

Filter one sample using mglsadf

Parameters:
x : float

An input sample

coef: array

MGLSA filter coefficients

Returns:
y : float

A filtered sample

AllPoleDF
class pysptk.synthesis.AllPoleDF(order=25)

All-pole digital filter that wraps poledf

Attributes:
delay : array

Delay

filt(x, coef)

Filter one sample using poledf

Parameters:
x : float

An input sample

coef: array

LPC (with loggain)

Returns:
y : float

A filtered sample

AllPoleLatticeDF
class pysptk.synthesis.AllPoleLatticeDF(order=25)

All-pole lattice digital filter that wraps ltcdf

Attributes:
delay : array

Delay

filt(x, coef)

Filter one sample using ltcdf

Parameters:
x : float

An input sample

coef: array

PARCOR coefficients (with loggain)

Returns:
y : float

A filtered sample

Synthesis filter interface
class pysptk.synthesis.SynthesisFilter

Synthesis filter interface

All synthesis filters must implement this interface

filt(x, coef)

Filter one sample

Parameters:
x : float

An input sample

coef : array

Filter coefficients

Returns:
y : float

A filtered sample
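
As an illustration of the filt(x, coef) convention, the sketch below defines a stateful one-pole filter that keeps its memory in a delay attribute, as the filters documented above do. The class OnePoleFilterSketch is hypothetical, and it deliberately does not subclass pysptk.synthesis.SynthesisFilter so that the example runs without pysptk installed.

```python
class OnePoleFilterSketch:
    """Hypothetical filter following the filt(x, coef) convention."""

    def __init__(self):
        # State carried over between successive calls to filt().
        self.delay = [0.0]

    def filt(self, x, coef):
        # y[n] = gain * x[n] + a * y[n-1]
        gain, a = coef
        y = gain * x + a * self.delay[0]
        self.delay[0] = y
        return y


f = OnePoleFilterSketch()
impulse = [1.0, 0.0, 0.0, 0.0]
response = [f.filt(x, (1.0, 0.5)) for x in impulse]
# response == [1.0, 0.5, 0.25, 0.125]
```

Because filt processes one sample per call, any object of this shape has to keep its own state between calls; that is the role the delay attribute plays here.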

Utilities

Audio files
example_audio_file() Get the path to an included audio example file.
Mel-cepstrum analysis
mcepalpha(fs[, start, stop, step, num_points]) Compute appropriate frequency warping parameter given a sampling frequency
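
One plausible way to choose a frequency warping parameter is to search for the alpha whose all-pass phase response best matches a mel-scale curve. The sketch below does this with a plain grid search. It is an illustration under stated assumptions and not necessarily how mcepalpha works: the mel formula, the squared-error criterion, the grid resolution and all function names are choices made for this example.

```python
import math


def mel_scale_sketch(fs, num_points):
    """Normalized mel-scale curve sampled on [0, fs/2) (assumed mel formula)."""
    step = (fs / 2.0) / num_points
    mel = [1000.0 / math.log(2.0) * math.log(1.0 + step * i / 1000.0)
           for i in range(num_points)]
    return [m / mel[-1] for m in mel]


def warping_sketch(alpha, num_points):
    """Normalized phase response of a first-order all-pass filter."""
    warped = []
    for i in range(num_points):
        omega = math.pi * i / num_points
        num = (1.0 - alpha * alpha) * math.sin(omega)
        den = (1.0 + alpha * alpha) * math.cos(omega) - 2.0 * alpha
        warped.append(math.atan2(num, den))  # stays within [0, pi]
    return [w / warped[-1] for w in warped]


def alpha_sketch(fs, step=0.01, num_points=256):
    """Grid-search alpha in [0, 1) minimizing squared error to the mel curve."""
    mel = mel_scale_sketch(fs, num_points)
    best_alpha, best_err = 0.0, float("inf")
    for k in range(int(round(1.0 / step))):
        alpha = k * step
        warped = warping_sketch(alpha, num_points)
        err = sum((m - w) ** 2 for m, w in zip(mel, warped))
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha


alpha_16k = alpha_sketch(16000)
alpha_44k = alpha_sketch(44100)
```

Higher sampling rates cover more of the compressed upper mel range, so the search returns a larger alpha for 44.1 kHz than for 16 kHz.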

Developer Documentation

Design principle

pysptk is a thin Python wrapper around SPTK. It is designed to keep its API as consistent with the original SPTK as possible while offering a better interface. A few design principles guide how the C interface is wrapped:

  1. Avoid really short names for variables (e.g. a, b, c, aa, bb, dd)

    Variable names should be informative. If the C functions use such short names, use self-descriptive names in the Python interface instead, unless the short names have a clear meaning in their context.

  2. Avoid too many function arguments

    Fewer is better. If the C functions take too many arguments, use keyword arguments with sensible default values for the optional ones in Python.

  3. Handle errors in python

    Since the C functions might (unfortunately) call exit internally on unexpected inputs, the Python side should check whether the inputs are supported before calling them.

Cython is used throughout to wrap the C interface.
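
Principles 2 and 3 can be sketched as a Python-side wrapper that exposes optional parameters as keyword arguments and validates inputs before anything reaches C. The function lpc_like_wrapper is hypothetical: its default values and validation rules are assumptions chosen for illustration, and it returns the checked arguments instead of calling a compiled extension.

```python
def lpc_like_wrapper(windowed, order=25, min_det=1.0e-6):
    """Hypothetical wrapper: keyword defaults plus input checks in Python."""
    if order < 1:
        raise ValueError("order must be a positive integer")
    if len(windowed) <= order:
        raise ValueError("frame length must exceed order")
    if min_det < 0.0:
        raise ValueError("min_det must be non-negative")
    # A real wrapper would call into the compiled extension here;
    # this sketch only returns the validated arguments.
    return {"frame_len": len(windowed), "order": order, "min_det": min_det}


checked = lpc_like_wrapper([0.0] * 64)

try:
    lpc_like_wrapper([0.0] * 10, order=25)
except ValueError as err:
    message = str(err)
```

Raising ValueError in Python gives the caller a traceback to work with, whereas an exit inside the C code would terminate the whole interpreter.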

How to build pysptk

You have to install numpy and cython first, and then:

git clone https://github.com/r9y9/pysptk
cd pysptk
git submodule update --init
python setup.py develop

should work.

Note

The dependency on SPTK is added as a git submodule. You have to check out the supported SPTK version with git submodule update --init before running setup.py.

How to build docs

The pysptk docs are built with Sphinx. Docs-related dependencies can be installed by running:

pip install .[docs]

at the top of the pysptk directory.

To build docs, go to the docs directory and then:

make html

You will see the generated docs in the _build directory as follows (the layout might differ depending on the Sphinx version):

% tree _build/ -d
_build/
├── doctrees
│   └── generated
├── html
│   ├── _images
│   ├── _modules
│   │   └── pysptk
│   ├── _sources
│   │   └── generated
│   ├── _static
│   │   ├── css
│   │   ├── fonts
│   │   └── js
│   └── generated
└── plot_directive
    └── generated

See _build/html/index.html for the top page of the generated docs.

How to add a new function

Many SPTK functions are not yet exposed. Adding a new function to pysptk typically takes the following steps:

  1. Add function signature to _sptk.pxd
  2. Add cython implementation to _sptk.pyx
  3. Add python interface (with docstrings) to sptk.py (or some proper module)

As you can see in setup.py, _sptk.pyx and SPTK sources are compiled into a single extension module.

Note

You might wonder why the Cython implementation and the Python interface are separated, given that a Cython module can be accessed directly from Python. The reasons are 1) to avoid rebuilding the Cython module when docstrings change in the source, and 2) to make the docs look good, since Sphinx currently seems unable to collect function arguments correctly from a Cython module. Relevant issue: pysptk/#33

An example

In _sptk.pxd:

cdef extern from "SPTK.h":
    double _agexp "agexp"(double r, double x, double y)

In _sptk.pyx:

def agexp(r, x, y):
    return _agexp(r, x, y)

In sptk.py:

def agexp(r, x, y):
    """Magnitude squared generalized exponential function

    Parameters
    ----------
    r : float
        Gamma
    x : float
        Real part
    y : float
        Imaginary part

    Returns
    -------
    Value

    """
    return _sptk.agexp(r, x, y)
