The method provides a spectral speech description for use in the synthesis of a speech utterance, where at least one spectral envelope input representation is received. In one solution, the improvement is made by manipulating an extremum, i.e. a peak or a valley, in the rapidly varying component of the spectral envelope representation. The rapidly varying component is manipulated to sharpen and/or accentuate extrema, after which it is merged back with the slowly varying component or with the spectral envelope input representation to create an enhanced spectral envelope final representation. In other solutions, a complex spectrum envelope final representation is created with phase information derived from one of the following: the group delay representation of a real spectral envelope input representation corresponding to a short-time speech signal, or a transformed phase component of the discrete complex frequency domain input representation corresponding to the speech utterance.
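
The first solution can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the description above: the envelope is taken to be a log-magnitude vector, the slowly varying component is estimated with a simple moving average, the extrema are accentuated by a constant gain, and the names `enhance_envelope`, `smooth_width` and `gain` are hypothetical.

```python
import numpy as np

def enhance_envelope(log_env, smooth_width=9, gain=1.5):
    """Accentuate peaks and valleys of a log-magnitude spectral envelope.

    Illustrative only: the moving-average split and the constant gain
    are assumed choices, not the method's prescribed operations.
    """
    if smooth_width % 2 == 0:
        raise ValueError("smooth_width must be odd")
    log_env = np.asarray(log_env, dtype=float)

    # Slowly varying component: moving average with edge padding,
    # so the output has the same length as the input.
    pad = smooth_width // 2
    padded = np.pad(log_env, pad, mode="edge")
    kernel = np.ones(smooth_width) / smooth_width
    slow = np.convolve(padded, kernel, mode="valid")

    # Rapidly varying component: what remains after removing the trend.
    fast = log_env - slow

    # Scale the rapid component (gain > 1 raises peaks and deepens
    # valleys), then merge it back with the slowly varying component.
    return slow + gain * fast

# Usage: a flat envelope with a single peak at bin 16.
envelope = np.zeros(32)
envelope[16] = 1.0
enhanced = enhance_envelope(envelope)
```

With a gain of 1 the function returns the input unchanged, so the gain directly controls how strongly the extrema are accentuated.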