I’m running some tests to compare CPU and GPU environments for prediction (Predict.py).
I’m using an audio file (mp3, 44100 Hz, stereo, fltp, 192 kb/s) with a duration of 00:03:15.29:
$ ffprobe /audio/12380187.mp3
ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
libavutil 56. 14.100 / 56. 14.100
libavcodec 58. 18.100 / 58. 18.100
libavformat 58. 12.100 / 58. 12.100
libavdevice 58. 3.100 / 58. 3.100
libavfilter 7. 16.100 / 7. 16.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 1.100 / 5. 1.100
libswresample 3. 1.100 / 3. 1.100
libpostproc 55. 1.100 / 55. 1.100
Input #0, mp3, from '/audio/12380187.mp3':
Metadata:
encoder : Lavf56.40.101
Duration: 00:03:15.29, start: 0.025057, bitrate: 192 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 192 kb/s
Metadata:
encoder : Lavc56.60
On an Intel i7 12-core CPU, the prediction log says Completed after 0:03:19:
$ time python Predict.py with cfg.full_44KHz input_path=/audio/12380187.mp3 output_path=/audio_sep/
Training full singing voice separation model, with difference output and input context (valid convolutions) and stereo input/output, and learned upsampling layer, and 44.1 KHz sampling rate
WARNING - Waveunet Prediction - No observers have been added to this run
INFO - Waveunet Prediction - Running command 'main'
INFO - Waveunet Prediction - Started
Producing source estimates for input mixture file /audio/12380187.mp3
Testing...
2018-11-20 14:54:05.306099: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Num of variables64
INFO:tensorflow:Restoring parameters from checkpoints/full_44KHz/full_44KHz-236118
INFO - tensorflow - Restoring parameters from checkpoints/full_44KHz/full_44KHz-236118
Pre-trained model restored for song prediction
INFO - Waveunet Prediction - Completed after 0:03:19
real 3m26.034s
user 13m30.420s
sys 4m40.200s
On an Intel Xeon 12-core machine with 2x Nvidia GeForce GTX 1080, the log says Completed after 0:00:16:
$ time python Predict.py with cfg.full_44KHz input_path=/audio/12380187.mp3
/usr/local/lib/python2.7/dist-packages/scikits/audiolab/soundio/play.py:48: UserWarning: Could not import alsa backend; most probably, you did not have alsa headers when building audiolab
warnings.warn("Could not import alsa backend; most probably, "
Training full singing voice separation model, with difference output and input context (valid convolutions) and stereo input/output, and learned upsampling layer, and 44.1 KHz sampling rate
WARNING - Waveunet Prediction - No observers have been added to this run
INFO - Waveunet Prediction - Running command 'main'
INFO - Waveunet Prediction - Started
Producing source estimates for input mixture file /audio/12380187.mp3
Testing...
2018-11-21 12:34:13.829481: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-11-21 12:34:13.830157: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.8475
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.46GiB
2018-11-21 12:34:13.961794: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-11-21 12:34:13.962562: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 1 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.8475
pciBusID: 0000:02:00.0
totalMemory: 7.93GiB freeMemory: 7.81GiB
2018-11-21 12:34:13.963292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0, 1
2018-11-21 12:34:14.531254: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-21 12:34:14.531305: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0 1
2018-11-21 12:34:14.531329: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N Y
2018-11-21 12:34:14.531336: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 1: Y N
2018-11-21 12:34:14.531830: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7209 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-11-21 12:34:14.589915: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 7543 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080, pci bus id: 0000:02:00.0, compute capability: 6.1)
Num of variables64
INFO:tensorflow:Restoring parameters from checkpoints/full_44KHz/full_44KHz-236118
INFO - tensorflow - Restoring parameters from checkpoints/full_44KHz/full_44KHz-236118
Pre-trained model restored for song prediction
INFO - Waveunet Prediction - Completed after 0:00:16
real 0m18.340s
user 0m15.972s
sys 0m5.528s
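To put the two runs side by side, here is a small sketch that turns the logged times into a speedup, a real-time factor (processing time divided by audio duration), and an average number of busy CPU cores ((user + sys) / real). All numbers are copied from the logs above; the helper name `to_seconds` and the variable names are just mine for illustration.

```python
from __future__ import division, print_function


def to_seconds(hms):
    """Convert an 'H:MM:SS' or 'H:MM:SS.ff' string to seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Values copied from the ffprobe output and the two prediction logs.
audio_s = to_seconds("00:03:15.29")  # duration of the input file
cpu_s = to_seconds("0:03:19")        # "Completed after" on the i7
gpu_s = to_seconds("0:00:16")        # "Completed after" on the Xeon + GTX 1080

speedup = cpu_s / gpu_s              # how much faster the GPU machine is
rtf_cpu = cpu_s / audio_s            # > 1 means slower than real time
rtf_gpu = gpu_s / audio_s

# Average busy cores from `time`: (user + sys) / real.
cpu_run_cores = (13 * 60 + 30.420 + 4 * 60 + 40.200) / (3 * 60 + 26.034)
gpu_run_cores = (15.972 + 5.528) / 18.340

print("speedup: {:.1f}x".format(speedup))
print("real-time factor: CPU {:.2f}, GPU {:.3f}".format(rtf_cpu, rtf_gpu))
print("avg busy cores: CPU run {:.1f}, GPU run {:.1f}".format(
    cpu_run_cores, gpu_run_cores))
```

So the CPU run is roughly real time and keeps about 5 of the 12 cores busy on average, while the GPU run is about 12x faster with a little over one core busy on the host.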
I can’t tell from the log whether TensorFlow is using both GPU devices or only GPU 0. Both devices are detected and created, but that alone does not show where the operations run. If I’m not wrong, most of the work is done in the model code here https://github.com/f90/Wave-U-Net/blob/master/Models/UnetSpectrogramSeparator.py#L39 where the computation graph is built. I assume that these operations are placed on gpu:0 in this configuration, so gpu:1 will not be used, but I’m not sure about it.
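One way I think this could be checked: in TensorFlow 1.x a session can be created with `tf.Session(config=tf.ConfigProto(log_device_placement=True))`, which logs the device each operation is assigned to. The sketch below only counts operations per device from such a log, so it runs without TensorFlow. The function name `count_ops_per_device` is hypothetical, the sample lines are made up, and I am assuming each placement line ends with the device name, which may differ between TensorFlow versions.

```python
from __future__ import print_function

import re
from collections import Counter

# Assumption: placement lines end with ".../device:GPU:0" or ".../device:CPU:0".
# Lines such as "Created TensorFlow device (... device:GPU:0 with ...)" do not
# end with the device name, so they are ignored.
DEVICE_RE = re.compile(r"/device:((?:GPU|CPU):\d+)\s*$")


def count_ops_per_device(log_lines):
    """Count how many logged operations were placed on each device."""
    counts = Counter()
    for line in log_lines:
        match = DEVICE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines in the assumed format, not taken from a real run.
sample_log = [
    "Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 "
    "with 7209 MB memory) -> physical GPU (device: 0)",
    "conv_a: (Conv2D): /job:localhost/replica:0/task:0/device:GPU:0",
    "conv_b: (Conv2D): /job:localhost/replica:0/task:0/device:GPU:0",
    "save/Const: (Const): /job:localhost/replica:0/task:0/device:CPU:0",
]

placement = count_ops_per_device(sample_log)
for device, n_ops in sorted(placement.items()):
    print(device, n_ops)
```

If no line in a real placement log ends in `GPU:1`, that would support the assumption that only gpu:0 does the work. Watching `nvidia-smi` during a run should show the same thing from the utilization side.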
Thank you very much!