import IPython.display as ipd
ipd.Audio('test.wav') # load a local WAV file
import numpy
sr = 22050 # sample rate
T = 2.0 # seconds
t = numpy.linspace(0, T, int(T*sr), endpoint=False) # time variable
x = 0.5*numpy.sin(2*numpy.pi*440*t) # pure sine wave at 440 Hz
ipd.Audio(x, rate=sr) # load a NumPy array
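The first snippet loads 'test.wav', which has to exist on disk. As a minimal sketch, assuming only the Python standard library is available, the same 440 Hz tone can be written to a 16-bit mono WAV file like this. The helper name write_sine_wav and its default arguments are hypothetical, chosen to mirror the values used above.

```python
import math
import struct
import wave


def write_sine_wav(path, freq=440.0, sr=22050, seconds=2.0, amplitude=0.5):
    """Write a mono 16-bit PCM sine tone to `path` and return the frame count.

    Hypothetical helper: name and defaults are illustrative, not from any library.
    """
    n_frames = int(seconds * sr)
    frames = bytearray()
    for n in range(n_frames):
        sample = amplitude * math.sin(2 * math.pi * freq * n / sr)
        # Scale the float sample in [-1, 1] to a signed 16-bit integer.
        frames += struct.pack('<h', int(round(sample * 32767)))
    with wave.open(path, 'wb') as wf:
        wf.setnchannels(1)    # mono
        wf.setsampwidth(2)    # 2 bytes = 16 bits per sample
        wf.setframerate(sr)
        wf.writeframes(bytes(frames))
    return n_frames


if __name__ == "__main__":
    print(write_sine_wav('test.wav'), "frames written")
```

A file produced this way can then be passed to ipd.Audio('test.wav') as in the first example.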
As shown below, a wide range of audio features can be extracted with very little code.
import librosa
x, sr = librosa.load('audio/simple_loop.wav')
import matplotlib.pyplot as plt
import librosa.display
plt.figure(figsize=(14, 5))
librosa.display.waveshow(x, sr=sr)  # waveplot was removed in librosa 0.10; waveshow replaces it
X = librosa.stft(x)
Xdb = librosa.amplitude_to_db(abs(X))
plt.figure(figsize=(14, 5))
librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='hz')
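The amplitude_to_db step above is what turns raw STFT magnitudes into the decibel values that get plotted. A rough sketch of the idea in plain Python follows. It mirrors the usual formula (20 * log10 of the magnitude relative to a reference, with a floor) and is not librosa's actual implementation; the function name amplitude_to_db_sketch is hypothetical, and the default values are assumed to match librosa's documented defaults.

```python
import math


def amplitude_to_db_sketch(magnitudes, ref=1.0, amin=1e-5, top_db=80.0):
    """Convert linear magnitudes to decibels relative to `ref`.

    Illustrative only: a simplified stand-in for the idea behind
    librosa.amplitude_to_db, operating on a flat list of floats.
    """
    # amin guards against log10(0) for silent bins.
    db = [20.0 * math.log10(max(amin, m) / ref) for m in magnitudes]
    # Anything more than top_db below the loudest value is clipped to that floor.
    floor = max(db) - top_db
    return [max(value, floor) for value in db]


if __name__ == "__main__":
    print(amplitude_to_db_sketch([1.0, 0.1, 0.0]))
```

A magnitude ten times smaller than the reference lands at about -20 dB, and complete silence is clipped to the floor instead of going to minus infinity.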
import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt
# Load the first 15 seconds of one of librosa's bundled example tracks
y, sr = librosa.load(librosa.ex('choice'), duration=15)
fig, ax = plt.subplots(nrows=2, ncols=1, sharex=True)
# Top panel: linear frequency axis, default hop length
D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
img = librosa.display.specshow(D, y_axis='linear', x_axis='time',
                               sr=sr, ax=ax[0])
ax[0].set(title='Linear-frequency power spectrogram')
ax[0].label_outer()
# Bottom panel: log frequency axis, with a larger hop length (fewer frames)
hop_length = 1024
D = librosa.amplitude_to_db(np.abs(librosa.stft(y, hop_length=hop_length)),
                            ref=np.max)
librosa.display.specshow(D, y_axis='log', sr=sr, hop_length=hop_length,
                         x_axis='time', ax=ax[1])
ax[1].set(title='Log-frequency power spectrogram')
ax[1].label_outer()
fig.colorbar(img, ax=ax, format="%+2.f dB")
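The second spectrogram above passes hop_length=1024 instead of relying on the default. To see what that changes, here is a small arithmetic sketch of the resulting STFT shape. It assumes centered (padded) frames, n_fft=2048, and a default hop of n_fft // 4, which is my understanding of librosa.stft's defaults; the helper names are hypothetical.

```python
def stft_shape(n_samples, n_fft=2048, hop_length=None):
    """Return (frequency bins, time frames) for a centered STFT.

    Assumes the signal is padded so that frames are centered on each hop,
    which gives 1 + n_samples // hop_length frames.
    """
    if hop_length is None:
        hop_length = n_fft // 4          # assumed default hop
    n_bins = 1 + n_fft // 2              # non-negative frequencies only
    n_frames = 1 + n_samples // hop_length
    return n_bins, n_frames


def frame_step_seconds(sr, hop_length):
    """Time between consecutive STFT frames, in seconds."""
    return hop_length / sr


if __name__ == "__main__":
    n_samples = 15 * 22050               # 15 seconds at 22050 Hz
    print(stft_shape(n_samples))                    # default hop of 512
    print(stft_shape(n_samples, hop_length=1024))   # hop used in the second panel
    print(frame_step_seconds(22050, 1024))
```

Doubling the hop length halves the number of time frames while leaving the number of frequency bins unchanged, so the second panel is coarser in time but no coarser in frequency.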
import soundfile as sf
data, samplerate = sf.read('existing_file.wav')
sf.write('new_file.flac', data, samplerate)
import soundfile as sf
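with sf.SoundFile('myfile.wav', 'r+') as f:  # 'r+' opens the file for reading and writing
    while f.tell() < f.frames:
        pos = f.tell()        # remember where this block starts
        data = f.read(1024)   # read up to 1024 frames
        f.seek(pos)           # rewind to the start of the block
        f.write(data*2)       # overwrite the block with doubled amplitude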
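The loop above is a read, seek back, overwrite pattern applied block by block. The same pattern can be sketched without soundfile on any seekable binary stream of raw 16-bit samples. This is an illustration under stated assumptions, not how soundfile works internally: the function double_in_place is hypothetical, the stream is assumed to hold an even number of bytes of native-endian int16 data, and values are clamped explicitly because doubling can exceed the 16-bit range.

```python
import io
from array import array


def double_in_place(f, block_samples=1024):
    """Double every int16 sample in a seekable binary stream, block by block."""
    while True:
        pos = f.tell()                       # remember where this block starts
        chunk = f.read(block_samples * 2)    # 2 bytes per int16 sample
        if not chunk:
            break                            # end of stream
        samples = array('h')
        samples.frombytes(chunk)
        # Clamp to the int16 range so doubling cannot overflow.
        doubled = array('h', (max(-32768, min(32767, 2 * s)) for s in samples))
        f.seek(pos)                          # rewind to the start of the block
        f.write(doubled.tobytes())           # overwrite; position is now at the next block


if __name__ == "__main__":
    buf = io.BytesIO(array('h', [1, -2, 20000, -20000]).tobytes())
    double_in_place(buf, block_samples=2)
    out = array('h')
    out.frombytes(buf.getvalue())
    print(list(out))
```

Processing in fixed-size blocks keeps memory use constant regardless of file length, which is the main reason to prefer this pattern over reading the whole file at once.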
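import parselmouth
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set() # Use seaborn's default style to make attractive graphs
plt.rcParams['figure.dpi'] = 100 # Show nicely large images in this notebook
# The two plotting helpers below were not shown in the original post; they are modeled on the Parselmouth documentation examples.
def draw_spectrogram(spectrogram, dynamic_range=70):
    X, Y = spectrogram.x_grid(), spectrogram.y_grid()
    sg_db = 10 * np.log10(spectrogram.values)
    plt.pcolormesh(X, Y, sg_db, vmin=sg_db.max() - dynamic_range, cmap='afmhot')
    plt.ylim([spectrogram.ymin, spectrogram.ymax])
    plt.xlabel("time [s]")
    plt.ylabel("frequency [Hz]")
def draw_intensity(intensity):
    plt.plot(intensity.xs(), intensity.values.T, linewidth=3, color='w')
    plt.plot(intensity.xs(), intensity.values.T, linewidth=1)
    plt.grid(False)
    plt.ylim(0)
    plt.ylabel("intensity [dB]")
snd = parselmouth.Sound("audio/the_north_wind_and_the_sun.wav")
intensity = snd.to_intensity()
spectrogram = snd.to_spectrogram()
plt.figure()
draw_spectrogram(spectrogram)
plt.twinx()
draw_intensity(intensity)
plt.xlim([snd.xmin, snd.xmax])
plt.show()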
from parselmouth.praat import call
manipulation = call(snd, "To Manipulation", 0.01, 75, 600)  # snd is the Sound loaded above; time step 0.01 s, pitch range 75-600 Hz
import opensmile
smile = opensmile.Smile(
feature_set=opensmile.FeatureSet.eGeMAPSv02,
feature_level=opensmile.FeatureLevel.Functionals,
)
smile.feature_names
The following are the speech features predefined in opensmile for this feature set (88 in total).
['F0semitoneFrom27.5Hz_sma3nz_amean',
'F0semitoneFrom27.5Hz_sma3nz_stddevNorm',
'F0semitoneFrom27.5Hz_sma3nz_percentile20.0',
'F0semitoneFrom27.5Hz_sma3nz_percentile50.0',
'F0semitoneFrom27.5Hz_sma3nz_percentile80.0',
'F0semitoneFrom27.5Hz_sma3nz_pctlrange0-2',
'F0semitoneFrom27.5Hz_sma3nz_meanRisingSlope',
'F0semitoneFrom27.5Hz_sma3nz_stddevRisingSlope',
'F0semitoneFrom27.5Hz_sma3nz_meanFallingSlope',
'F0semitoneFrom27.5Hz_sma3nz_stddevFallingSlope',
'loudness_sma3_amean',
'loudness_sma3_stddevNorm',
'loudness_sma3_percentile20.0',
'loudness_sma3_percentile50.0',
'loudness_sma3_percentile80.0',
'loudness_sma3_pctlrange0-2',
'loudness_sma3_meanRisingSlope',
'loudness_sma3_stddevRisingSlope',
'loudness_sma3_meanFallingSlope',
'loudness_sma3_stddevFallingSlope',
'spectralFlux_sma3_amean',
'spectralFlux_sma3_stddevNorm',
'mfcc1_sma3_amean',
'mfcc1_sma3_stddevNorm',
'mfcc2_sma3_amean',
'mfcc2_sma3_stddevNorm',
'mfcc3_sma3_amean',
'mfcc3_sma3_stddevNorm',
'mfcc4_sma3_amean',
'mfcc4_sma3_stddevNorm',
'jitterLocal_sma3nz_amean',
'jitterLocal_sma3nz_stddevNorm',
'shimmerLocaldB_sma3nz_amean',
'shimmerLocaldB_sma3nz_stddevNorm',
'HNRdBACF_sma3nz_amean',
'HNRdBACF_sma3nz_stddevNorm',
'logRelF0-H1-H2_sma3nz_amean',
'logRelF0-H1-H2_sma3nz_stddevNorm',
'logRelF0-H1-A3_sma3nz_amean',
'logRelF0-H1-A3_sma3nz_stddevNorm',
'F1frequency_sma3nz_amean',
'F1frequency_sma3nz_stddevNorm',
'F1bandwidth_sma3nz_amean',
'F1bandwidth_sma3nz_stddevNorm',
'F1amplitudeLogRelF0_sma3nz_amean',
'F1amplitudeLogRelF0_sma3nz_stddevNorm',
'F2frequency_sma3nz_amean',
'F2frequency_sma3nz_stddevNorm',
'F2bandwidth_sma3nz_amean',
'F2bandwidth_sma3nz_stddevNorm',
'F2amplitudeLogRelF0_sma3nz_amean',
'F2amplitudeLogRelF0_sma3nz_stddevNorm',
'F3frequency_sma3nz_amean',
'F3frequency_sma3nz_stddevNorm',
'F3bandwidth_sma3nz_amean',
'F3bandwidth_sma3nz_stddevNorm',
'F3amplitudeLogRelF0_sma3nz_amean',
'F3amplitudeLogRelF0_sma3nz_stddevNorm',
'alphaRatioV_sma3nz_amean',
'alphaRatioV_sma3nz_stddevNorm',
'hammarbergIndexV_sma3nz_amean',
'hammarbergIndexV_sma3nz_stddevNorm',
'slopeV0-500_sma3nz_amean',
'slopeV0-500_sma3nz_stddevNorm',
'slopeV500-1500_sma3nz_amean',
'slopeV500-1500_sma3nz_stddevNorm',
'spectralFluxV_sma3nz_amean',
'spectralFluxV_sma3nz_stddevNorm',
'mfcc1V_sma3nz_amean',
'mfcc1V_sma3nz_stddevNorm',
'mfcc2V_sma3nz_amean',
'mfcc2V_sma3nz_stddevNorm',
'mfcc3V_sma3nz_amean',
'mfcc3V_sma3nz_stddevNorm',
'mfcc4V_sma3nz_amean',
'mfcc4V_sma3nz_stddevNorm',
'alphaRatioUV_sma3nz_amean',
'hammarbergIndexUV_sma3nz_amean',
'slopeUV0-500_sma3nz_amean',
'slopeUV500-1500_sma3nz_amean',
'spectralFluxUV_sma3nz_amean',
'loudnessPeaksPerSec',
'VoicedSegmentsPerSec',
'MeanVoicedSegmentLengthSec',
'StddevVoicedSegmentLengthSec',
'MeanUnvoicedSegmentLength',
'StddevUnvoicedSegmentLength',
'equivalentSoundLevel_dBp']
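Most of the names above follow a descriptor_smoothing_functional pattern, for example 'loudness_sma3_amean'. As a small sketch, they can be grouped by descriptor with plain string handling. The pitch features are expressed in semitones above 27.5 Hz, and the second helper shows that conversion. Both function names are hypothetical, and reading "sma3" as 3-frame smoothing and "nz" as "non-zero values only" is my interpretation of the naming convention, not something stated in the post.

```python
import math
from collections import defaultdict


def group_by_descriptor(names):
    """Group feature names by the part before the first underscore."""
    groups = defaultdict(list)
    for name in names:
        # partition() leaves the suffix empty for names without an underscore.
        descriptor, _, suffix = name.partition('_')
        groups[descriptor].append(suffix)
    return dict(groups)


def hz_to_semitones(freq_hz, base_hz=27.5):
    """Express a frequency as semitones above `base_hz` (12 semitones per octave)."""
    return 12.0 * math.log2(freq_hz / base_hz)


if __name__ == "__main__":
    sample = [
        'loudness_sma3_amean',
        'loudness_sma3_stddevNorm',
        'loudnessPeaksPerSec',
        'equivalentSoundLevel_dBp',
    ]
    print(group_by_descriptor(sample))
    print(hz_to_semitones(440.0))   # 440 Hz is four octaves above 27.5 Hz
```

Grouping this way makes it easier to see that most descriptors contribute a mean (amean) and a normalized standard deviation (stddevNorm), while pitch and loudness add percentiles and slope statistics.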