I recently lost the external HDD that held my entire karaoke library, just under 2TB of files. I still have all my original online backups from my various providers, archives, and CDs, so I purchased another HDD and went through the process of consolidating the library again (this time I will be backing it up on a NAS). At this point I had to let VDJ scan every track again, and the process took days. I would love to have a dedicated "database creation" mode in which the program allocates all system resources to scanning and importing tracks, possibly implementing OpenCL as well. Or, preferably, a stripped-down standalone app based on the existing file importation architecture, with the same goal in mind. Maybe even an Android/iOS version for file prep, beatgridding, etc. on the go.
Posted Sun 20 Oct 19 @ 7:57 pm
Simply open VirtualDJ, don't load any tracks, start scanning, and minimize VirtualDJ.
That should already be close to using most resources for scanning.
Posted Mon 21 Oct 19 @ 4:42 am
Just sitting idle I'm getting 5-15% CPU consumption and around half a GB of RAM committed. In normal circumstances that is a drop in the bucket, but when you're dealing with huge batch jobs every little bit helps. My VJ rig has 2 pretty good graphics cards just twiddling their thumbs during database operations. Incorporating OpenCL/parallel processing and stripping off some of the inherent CPU overhead could give a big boost to the process, especially since it's just the kind of job that OpenCL was meant to speed up (lots of parallel threads). It could also make the results more consistent across different devices (scanning a track on my Mac will often produce different results in BPM, key, etc. than the same file scanned on my PC). It would also be nice to have a mobile version, possibly as another premium app like VDJ Remote. I know I would pay for it.
Posted Mon 21 Oct 19 @ 5:33 pm
Related to this, I just opened a new thread asking for a configurable degree of parallelism in batch processing. Right now it seems to have a maximum of ten simultaneous batch operations, and this is insufficient for many modern CPUs with high core counts. It's also a much quicker and easier fix than trying to write and QA a cross-platform compute shader for the analysis process (which I'm not sure is actually a problem that compute shaders are well-suited for, due to the size of the input data vs. the amount of processing that needs to be done, and the lack of per-task parallelism involved).
Posted Mon 21 Oct 19 @ 9:23 pm
With VDJ minimized you should not see 10% idle CPU usage.
Also, when checking CPU usage, don't forget to take into account the frequency your CPU is running at. (5% at its lowest frequency would in practice be an overhead of only 1 or 2% at the full frequency you'd be running at while scanning.)
I don't think there's currently a hard limit to the number of simultaneous scans.
VDJ will only count physical cores though, as last time I checked hyperthreading didn't improve scanning speed much.
What kind of CPU do you have?
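The frequency point above can be made concrete with a little arithmetic. This is only a rough sketch: it assumes the same background work takes a share of CPU time inversely proportional to clock speed, and the 1000 MHz and 4000 MHz figures are hypothetical examples, not measurements from anyone's machine in this thread.

```python
def overhead_at_full_clock(idle_percent, idle_mhz, full_mhz):
    """Rescale a CPU-usage reading taken at a low clock to the full clock.

    Assumes the background work costs a fixed number of cycles per second,
    so its share of CPU time shrinks in proportion to the clock speed.
    """
    return idle_percent * idle_mhz / full_mhz


# Hypothetical clocks: 5% observed while idling at 1000 MHz,
# rescaled to a 4000 MHz clock under scanning load.
print(overhead_at_full_clock(5, 1000, 4000))
```

With those assumed clocks the 5% reading shrinks to a little over 1%, which is in line with the "1 or 2%" estimate in the post.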
Posted Tue 22 Oct 19 @ 1:12 am
I have 56 physical cores (112 threads with SMT) on my 2x Xeon Platinum 8276 workstation, and it absolutely does not scale over those; only 10 simultaneous operations at a time.
Posted Tue 22 Oct 19 @ 2:32 pm
Checked and it is indeed limited to 8 simultaneous scans.
Not completely sure if it should be increased though.
At 8 scans of 320kbps audio on a fast CPU you're looking at a required read speed of 100MB/sec, so about the limit of average regular HDDs. (Especially since it's spread over 8 files, it will start to look like random access, where even SSDs start to slow down.)
With video files the reads become even more random, so on most systems I doubt you'd see much of a speed improvement above 8 simultaneous scans.
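The 100MB/sec figure above can be unpacked with some back-of-the-envelope arithmetic. The sketch below is not taken from VirtualDJ; the function names are made up for illustration, and the "speed versus real time" number is simply what the post's own figures imply rather than a measured scan rate.

```python
def required_read_rate_mb_s(simultaneous_scans, bitrate_kbps, speed_vs_realtime):
    """Aggregate read rate in MB/s for N parallel scans of constant-bitrate audio."""
    bytes_per_audio_second = bitrate_kbps * 1000 / 8  # 320 kbps -> 40,000 bytes
    return simultaneous_scans * bytes_per_audio_second * speed_vs_realtime / 1_000_000


def implied_speed_vs_realtime(total_mb_s, simultaneous_scans, bitrate_kbps):
    """How many times faster than playback each scan must run to hit total_mb_s."""
    per_scan_bytes = total_mb_s * 1_000_000 / simultaneous_scans
    return per_scan_bytes / (bitrate_kbps * 1000 / 8)


# Figures from the post: 8 scans, 320 kbps audio, 100 MB/s total.
speed = implied_speed_vs_realtime(100, 8, 320)
print(speed)                                      # per-scan speed-up over playback
print(required_read_rate_mb_s(8, 320, speed))     # back to the total read rate
```

In other words, the estimate assumes each scan chews through audio roughly 300 times faster than real time, at 12.5 MB/s per file.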
Posted Tue 22 Oct 19 @ 2:52 pm
Up to you as to whether you think a recommended default of 8 is applicable because of the scans being IO bound when reading from mechanical disks, but I would certainly ask for it to be configurable for those of us with faster storage (e.g. NVMe SSDs) and more cores. For example, a single mid-range NVMe SSD has a sequential read throughput of around 30x your 100MB/s example, which certainly brings it back to being a CPU-bound problem on even the highest-end systems.
Posted Tue 22 Oct 19 @ 11:55 pm
It would be nice to have a configurable setting to increase the number of parallel scan operations. Most modern laptops, even the mid-to-low range models, have SSDs now. My PC for mobile shows isn't that new, but the Samsung SSDs in it still reach over 2000 MB/s in practical testing. So throttling the simultaneous operations limit based on the 100 MB/s I/O of mechanical drives is kneecapping a majority of users, I would think, assuming the benchmark is the average VDJ user's storage I/O rather than the lowest common denominator. And if the feature were optional, it wouldn't break compatibility for those users still operating on mechanical drives anyway.
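To illustrate the kind of optional setting being asked for, here is a minimal sketch of a batch scanner whose degree of parallelism is configurable, with a conservative default. It is not how VirtualDJ is implemented: `resolve_worker_count`, `scan_library`, and the `analyze` callback are hypothetical names, and the default of 8 is just the cap reported earlier in this thread.

```python
import os
from concurrent.futures import ThreadPoolExecutor

DEFAULT_MAX_SCANS = 8  # assumption: mirrors the cap reported in this thread


def resolve_worker_count(configured=None, cpu_count=None):
    """Pick the number of parallel scans.

    With no user setting, stay at or below the default cap so mechanical
    drives are not swamped. An explicit setting overrides the cap.
    """
    if cpu_count is None:
        # Note: os.cpu_count() reports logical cores, not physical ones.
        cpu_count = os.cpu_count() or 1
    if configured is None:
        return max(1, min(cpu_count, DEFAULT_MAX_SCANS))
    return max(1, int(configured))


def scan_library(paths, analyze, max_scans=None):
    """Run the hypothetical analyze(path) over all paths, max_scans at a time."""
    workers = resolve_worker_count(max_scans)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input paths.
        return list(pool.map(analyze, paths))


# Placeholder "analysis" so the sketch runs without any audio files.
print(scan_library(["a.mp3", "b.mp3"], str.upper, max_scans=4))
```

Threads keep the sketch short; a real analyzer written in pure Python would need processes or native code to actually keep many cores busy. The point is only that users on mechanical drives keep the default, while someone with NVMe storage and many cores can raise `max_scans`.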
Posted Thu 24 Oct 19 @ 12:35 am
Could Adion or another member of the team please confirm whether or not the degree-of-parallelism config option is going to be made available? I'm currently shopping around for DJ software and this is a breaking issue for me.
Posted Sat 02 Nov 19 @ 6:56 pm
It might, but I'd be interested to know your use case if this is a breaking issue for you.
Normally you'd only have a big batch to scan the first time you start using VirtualDJ, and you could just leave it running overnight once.
After that I can't really imagine scan speed being an issue unless you add thousands of files a week (but then how would you ever have time to listen to them?)
Posted Sun 03 Nov 19 @ 6:17 am