Re: Multithreaded qsort(3)
- From: "Jeremy Messenger" <mezz7@xxxxxxx>
- Date: Thu, 15 Mar 2007 07:44:03 -0500
On Thu, 15 Mar 2007 03:42:21 -0500, Diomidis Spinellis <dds@xxxxxxx> wrote:
Greetings,
I've added multhread support to our qsort implementation. You can see the code at <http://people.freebsd.org/~dds/qsort_mt.c> and some performance figures at the end of this email. I propose to add it to our libc. I need your opinions on the interface, and also any comments you may have on the code. I see the following design dimensions:
That remind me other days when I have readed in one of blog that have some good info about sorts. I will posting a few of links in here in case if anyone is insteresting to read. ;-)
Quick Sort (qsort()):
http://jeffreystedfast.blogspot.com/2007/03/quick-sort.html
25 Byte Sort Routine:
http://jeffreystedfast.blogspot.com/2007/03/25-byte-sort-routine.html
Merge Sort:
http://jeffreystedfast.blogspot.com/2007/02/merge-sort.html
A lot of more sorts:
Shell Sort
Binary Insertion Sort
Insertion Sort
Bubble Sort
http://jeffreystedfast.blogspot.com/search/label/sorting
Cheers,
Mezz
1. Naming and availability
--------------------------
- Option 1.1: Replace qsort with this implementation
* Pros: Programs gain performance without any source code change; abstraction (number of processors/cores should be invisible, just as the CPU architecture or disk drive interface)
* Cons: May confuse multithreaded programs; programs require linking with pthreads library; programs whose compare function is not thread safe will break
- Option 1.2: Name it qsort_mt
* Pros: POLA; programs can tune the call according to their need
* Cons: Programs require source code changes; we violate abstraction; namespace polution
My proposal: option 1.2: name it qsort_mt and make it available in a separate library (libc_mt).
2. Interface
------------
- Option 2.1: qsort compatible
* Pros: Drop in replacement; doesn't need programmer understanding; compatible with option 1.1
* Cons: Can't be tuned for a specific program or workload
- Option 2.2: Add nthreads parameter, specifying the maximum number of threads
* Pros: Multithreaded programs can tune load balancing; all programs can optimise nthreads according to the workload characteristics.
* Cons: Libc routines are generaly expected to Do he Right Thing, without handholding; must agree on mechanism for determining the number of threads; the benefits of tuning are unlikely to be dramatic.
- Option 2.3: Add forkelem, specifying minimum number of elements for a new thread (below forkelem, I use single-threaded qsort).
* Pros: programs can optimise nthreads according to the workload characteristics.
* Cons: Libc routines are generaly expected to Do he Right Thing, without handholding; the benefits of tuning are unlikely to be dramatic.
- Option 2.4: Have the library tune itsself for a given system or individual programs by measuring different values
* Pros: higher performance when the tuned values are reused
* Cons: Lack of prior art; performance drop when determining tuning values; security considerations if values are cached; complexity
(Options 2.2 and 2.3 are orthogonal; options 2.1 and 2.4 are orthogonal)
My proposal: go for 2.1, based on the way C library routines work.
3. Determining the number of threads
------------------------------------
- Option 3.1: NPROCS environment variable
* Pros: prior art: CrayOS, BSD/OS; users can tune it.
* Cons: May be too blunt; requires setting by user or administrator
- Option 3.[234]: hw.ncpu or kern.smp.cpus sysctl, or pmc_ncpu()
* Pros: automatic, right for the underlying hardware
* Cons: even blunter
- Option 3.5: Add library interface for the above as int nthreads_mt()
* Pros: we abstract policy; programs can replace it; reusability; extensibility.
* Cons: namespace polution; it is not clear that a single number is correct for all uses of a program.
(All the above options are orthogonal)
My proposal: Add library interface using NPROCS and falling back to a sysctl variable. This interface can be extended in the future.
Performance figures
-------------------
Reported times are elapsed, user, and system seconds.
$ uname -a
FreeBSD sledge.freebsd.org 7.0-CURRENT FreeBSD 7.0-CURRENT #924: Wed Mar 14 14:25:07 UTC 2007 root@xxxxxxxxxxxxxxxxxx:/h/src/sys/amd64/compile/SLEDGE amd64
$ qsort_mt -?
qsort_mt: option requires an argument -- h
usage: qsort_mt [-stv] [-f forkelements] [-h threads] [-n elements]
-l Run the libc version of qsort
-s Test with 20-byte strings, instead of integers
-t Print timing results
-v Verify the integer results
Defaults are 1e7 elements, 2 threads, 100 fork elements
$ gcc -DTEST -O qsort_mt.c -o qsort_mt -lthr -lpmc
$ qsort_mt -t -h 2 -n 100000000 -l # Integers; qsort(3)
46.5 46.2 0.314
$ ./qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt
27.5 46.3 0.301
$ qsort_mt -t -h 2 -n 10000000 -s -l # Strings; qsort(3)
41.3 40.1 1.2
$ ./qsort_mt -t -h 2 -n 10000000 -s # Strings; qsort_mt
26.7 40.1 1.16
$ gcc -DTEST -O qsort_mt.c -o qsort_mt -lpthread -lpmc
$ qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt
28 47.1 0.368
$ qsort_mt -t -h 2 -n 10000000 -s # Strings; qsort_mt
27.1 40.6 1.06
Thanks,
Diomidis - dds@
http://www.spinellis.gr
--
mezz7@xxxxxxx - mezz@xxxxxxxxxxx
FreeBSD GNOME Team - FreeBSD Multimedia Hat (ports, not src)
http://www.FreeBSD.org/gnome/ - gnome@xxxxxxxxxxx
http://wiki.freebsd.org/multimedia - multimedia@xxxxxxxxxxx
_______________________________________________
freebsd-arch@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"
- References:
- Multithreaded qsort(3)
- From: Diomidis Spinellis
- Multithreaded qsort(3)
- Prev by Date: Re: Multithreaded qsort(3)
- Next by Date: Re: <sys/queue.h> bikeshed proposal
- Previous by thread: Re: Multithreaded qsort(3)
- Next by thread: Re: Multithreaded qsort(3)
- Index(es):
Relevant Pages
|
|