theLink 10.0 NHI1 - theKernel - theLink - theConfig - theSq3Lite - theCompiler - theBrain - theGuard
README_PERFORMANCE

The performance test is located at:

NHI1_HOME/performance

and is used to measure the performance of theLink across different programming languages.

announcement

The performance test has been revised and expanded.

  • The test now uses the installed executables rather than the build executables.

The aim was to separate development and testing more precisely and thus make the results easier to reproduce across system boundaries.
A strict distinction is now made between the executables in DEVELOPMENT

  • NHI1_BUILD/TARGET/SETUP

and the executables in INSTALLATION

  • NHI1_HOME/performance/gen/TARGET/SETUP/inst

files/directories used

gen ......................... place for generated output
gen/TARGET .................. TARGET specific results: x86_64-suse-linux-gnu, i686-suse-linux-gnu, ...
gen/total_link.perf ......... result table created by 'results.bash'
gen/TARGET/SETUP ............ SETUP specific results: perf-release or perf-aggressive
gen/TARGET/SETUP/*.perf ..... results of a specific test: c_pipe, tcl_uds_fork, ...
gen/TARGET/SETUP/inst ....... installation directory for *theLink* specific to TARGET & SETUP
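
The layout above can be expressed as a small path helper. This is only a sketch: the function names `perf_paths` and `perf_file` and the example value `/opt/NHI1` for NHI1_HOME are assumptions made for illustration and are not part of the NHI1 tools.

```python
from pathlib import PurePosixPath

def perf_paths(home, target, setup):
    """Return the directories and files named in the layout above."""
    gen = PurePosixPath(home) / "performance" / "gen"
    setup_dir = gen / target / setup
    return {
        "gen": gen,                          # place for generated output
        "total": gen / "total_link.perf",    # result table created by results.bash
        "setup": setup_dir,                  # SETUP specific results
        "inst": setup_dir / "inst",          # installation directory
    }

def perf_file(home, target, setup, test):
    """Return the result file of one test, e.g. c_pipe or tcl_uds_fork."""
    return perf_paths(home, target, setup)["setup"] / f"{test}.perf"

# "/opt/NHI1" is a hypothetical NHI1_HOME used only for this example.
paths = perf_paths("/opt/NHI1", "x86_64-suse-linux-gnu", "perf-release")
print(paths["inst"])
print(perf_file("/opt/NHI1", "x86_64-suse-linux-gnu", "perf-release", "c_pipe"))
```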

TOOLS

TOOL ........................ DESCRIPTION
performance/Makefile.am ..... wrapper that automates the build.bash and performance.bash commands
build.bash .................. configure, build and install the 'performance' executables
performance.bash ............ run the performance tests previously built with build.bash
results.bash ................ collect the performance test results and present them as a table
perfclient .................. the client part of the performance test tool
perfserver .................. the server part of the performance test tool

TESTS

The tests examine various aspects of the software:

  1. The performance test only starts once the perfserver has been started and initialized: a data packet is sent first to initialize all buffers and, where applicable, to trigger the translation of code etc. in the Target-Programming-Language (TPL).
  2. A large part of the actual work takes place in the libraries of the NHI1 project, primarily theKernel and theLink.
  3. The MOT-Wrapper is used to test the integration of C into the Target-Programming-Language (TPL), whereby the callback performance and the "overhead" of the C / Target-Programming-Language (TPL) wrapper are relevant.
  4. For the startup test (--parent), it is relevant how quickly a thread or process can be created and initialized in the Target-Programming-Language (TPL).
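
The warm-up step from point 1 can be illustrated with a generic measuring loop. This is a minimal sketch and not the implementation used by perfclient: the function `measure`, its parameters and the reported rate are assumptions chosen for illustration.

```python
import time

def measure(service, warmup=1, seconds=0.2):
    """Call 'service' repeatedly for 'seconds' and return the calls per second.

    The warm-up calls are made before the clock starts, so that one-time
    costs (buffer setup, code translation in the TPL) are not measured.
    """
    for _ in range(warmup):
        service()
    count = 0
    start = time.perf_counter()
    deadline = start + seconds
    while time.perf_counter() < deadline:
        service()
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed

def nop():
    """Stand-in for a service that does nothing."""
    pass

rate = measure(nop)
print(f"{rate:.0f} calls per second")
```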
--send-nothing

An empty packet is built, sent and received without waiting for a response.

Python example for the service called at the perfserver

def NTHT (self):
    pass
--send

Like --send-nothing but a packet with data is built, sent, received and read. The perfclient does not wait for a response.

The last object sent is a binary blob. This blob is read in perfserver as a buffer object (pointer value), so no additional storage is allocated.

python RDUL service

def RDUL(self):
    self.ReadI8()
    self.ReadI16()
    self.ReadI32()
    self.ReadDBL()
    self.ReadBUF()
--send-and-wait

Like --send, but the sender waits for a response; no other packet is sent while waiting.

python ECUL service

def ECUL(self):
    self.SendSTART()
    self.SendI8(self.ReadI8())
    self.SendI16(self.ReadI16())
    self.SendI32(self.ReadI32())
    self.SendDBL(self.ReadDBL())
    self.ProxyItem(self)
    self.SendRETURN()
--send-and-callback

Like --send-and-wait, but further packets are sent while the sender waits for a response.

--send-persistent

Like --send-and-wait or --send-and-callback, but the packet is stored in a local database and only processed further once the local database has acknowledged the packet (topic: guaranteed delivery).
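
The store-then-process idea can be sketched with the sqlite3 module from the Python standard library. This is an illustration only and not the theLink implementation: the class `PersistentQueue`, the table name `packets` and the in-memory database are assumptions.

```python
import sqlite3

class PersistentQueue:
    """Store packets in a local database before they are processed."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS packets "
            "(id INTEGER PRIMARY KEY, body BLOB NOT NULL)"
        )
        self.db.commit()

    def store(self, body):
        """Insert a packet; the commit acts as the acknowledgement."""
        with self.db:  # commits on success, rolls back on an exception
            cursor = self.db.execute(
                "INSERT INTO packets (body) VALUES (?)", (body,)
            )
        return cursor.lastrowid

    def process(self, handler):
        """Hand every stored packet to 'handler' and delete it afterwards."""
        rows = self.db.execute(
            "SELECT id, body FROM packets ORDER BY id"
        ).fetchall()
        for packet_id, body in rows:
            handler(body)
            with self.db:
                self.db.execute("DELETE FROM packets WHERE id = ?", (packet_id,))
        return len(rows)

queue = PersistentQueue()
queue.store(b"\x01\x02")
queue.store(b"hello")
seen = []
handled = queue.process(seen.append)
print(handled, seen)
```

A packet is deleted only after the handler has returned, so a packet whose handler fails stays in the database and can be processed again later.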

--parent

The creation of a new parent instance is tested. A parent instance is either a new process or thread.

To maximize performance, 5 workers (--wrk 5) are started in perfclient, which use MqLinkCreate to create as many new parent instances in the perfserver as possible.

            |- worker#1 ...................... |- instance ...
            |- worker#2 ...................... |- instance ...
perfclient -|- worker#3 .......... perfserver -| ...
            |- worker#4 ...
            |- worker#5

After a successful startup, the new instance is deleted with MqContextDelete.
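
The worker pattern can be sketched in plain Python. This is not the perfclient code: starting a trivial Python process stands in for MqLinkCreate followed by MqContextDelete, and the names `start_instance` and `startup_rate` are assumptions made for illustration.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor

def start_instance(_):
    """Start a trivial child process, wait for it to exit, return its exit code."""
    return subprocess.run([sys.executable, "-c", "pass"]).returncode

def startup_rate(total=10, workers=5):
    """Let 'workers' threads start 'total' processes; return exit codes and rate."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(start_instance, range(total)))
    elapsed = time.perf_counter() - start
    return codes, total / elapsed

codes, rate = startup_rate()
print(f"{len(codes)} instances started, {rate:.1f} per second")
```

Several workers are used so that the waiting time of one startup overlaps with the work of the others.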

--child
Like --parent but a child instance is created with MqLinkCreateChild.
--bus

Like --send-and-wait, but a new MkBufferStreamC instance is created on perfserver. A response is sent and processed by the perfclient.

python BUST service

def BUST (self):
    bus = MkBufferStreamC.CreateTLS( "perfserver-BUST" )
    while self.ReadItemExists():
        bus.WriteBUF(self.ReadBUF())
    bus.PosToStart()
    self.SendSTART()
    while bus.ReadItemExists():
        self.SendBUF(bus.ReadBUF())
    self.SendRETURN()
--bfl

Like --send-and-wait, but a new MkBufferListC instance is created on perfserver. A response is sent and processed by the perfclient.

python BFLT service

def BFLT (self):
    bfl = MkBufferListC.CreateTLS( "perfserver-BFLT" )
    while self.ReadItemExists():
        bfl.AppendBUF(self.ReadBUF())
    self.SendSTART()
    for i in range(bfl.Size()):
        self.SendBUF(bfl.IndexGet(i))
    self.SendRETURN()

ANALYSIS

PARENT with PIPE or SPAWN

The --parent test measures the startup performance of a process, i.e. starting an executable from the file system. For a fast language like C or C++ this is therefore also a test of the speed of the file system. A difference of a factor of 10 between a mount and a local file system is possible.

I chose the in-memory BUILD file system as the binary location, which is ~1% slower than my local file system.

TCL

setup

  • release has --spawn and --thread as startup and is built with shared libraries.
  • aggressive has --spawn and --fork as startup and is built with static libraries.
  • a new --thread (release) parent is started as a standalone interpreter in the same process.

--send-X

  • the speed is slower than Python.
  • the threaded build (release) is much slower than the static, non-threaded build (aggressive).
  • the --send-nothing feature has shown that a threaded TCL interpreter alone costs 40% of the performance, even if almost all the work is done in the theLink library.

--parent

  • the start is slow with --spawn and --thread and normal with --fork.
  • since starting with --spawn is only a little slower than with --thread, it can be assumed that initializing the interpreter has a significant impact on startup time.
PYTHON

setup

  • release and aggressive have --spawn and --fork as startup.
  • release is built with shared libraries and aggressive is built with static libraries.
  • although the interpreter is built with thread support by default, theLink does not use it.

--send-X

  • although thread support is always enabled in the interpreter, there is, in contrast to TCL, almost zero overhead.
  • the --send-XXX speed is fast, close to C and C++.

--parent

  • interpreter startup is slower than TCL and much slower with --fork, because the interpreter must be cleaned up before the fork (thread penalty?).
  • The poor --fork performance is a real handicap for using Python as an application server.
RUBY
read more at RUBY performance analysis
C/C++

setup

  • release and aggressive have --spawn, --fork and --thread as startup.
  • release is built with shared libraries and aggressive is built with static libraries.
  • the visible difference between C and C++ is that C++ requires an additional library: libstdc++.so.6.

--send-X

  • the send speed is almost identical for C and C++, even between release and aggressive.
  • C++ has a very small disadvantage of about 1% and for --thread of about 2%.

--parent

  • the biggest difference between C and C++ is the startup.
    • C++ has a disadvantage of about 10% for --pipe and --spawn and about 20% for --fork.
    • the --thread startup is almost identical between C and C++.
JAVA

setup

  • Java has no support for aggressive, because Java requires thread support, which in turn disables --fork support.
  • The release has --spawn and --thread support and is always built with shared libraries.

--send-X

  • The send performance is below average for a compiled language and is significantly lower than the values for C and C++.

--parent

  • Startup performance is poor for a compiled language, well below that of C and C++ and even below that of a "scripting" language like TCL or Python.
  • The strength of Java is clearly the startup performance with --thread, which stands out from all languages without thread support. In the uds_thread results it is far ahead of TCL with thread support but still behind C and C++.

RESULTS

For details on testing read the performance.bash documentation.

abbreviations
info ................ test example
R|A ................. perf-release versus perf-aggressive
perf-release ........ built with: shared, thread and normal optimization
perf-aggressive ..... built with: static, no-thread and aggressive optimization
pipe ................ perfclient ... @ perfserver ...
tcp|uds ............. perfserver --tcp|uds ... & ... perfclient --tcp|uds ...
thread|fork|spawn ... perfserver --thread|fork|spawn ... & ... perfclient ...
total_link.perf

 x86_64-suse-linux-gnu     |   send     send     send     send    create    create    data     data  
 2024-11-01 08:30:00       |  NOTHING   END    CALLBACK   WAIT    PARENT    CHILD     BUS      BFL   
 ------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

 pipe:
 R: C                      |   540007   402644   233921    91004     3923    38406    90085    89640 
 R: C++                    |   527056   387423   215244    88570     2497    36541    88042    88582 
 R: Python                 |   493313   315040   160869    75802      103    21982    68504    65800 
 R: Tcl                    |   332380   190834   120565    61112      132    23589    43077    42926 
 R: Java                   |   468695   299161   164803    76668       70    19256    70039    69950 
 R: Ruby                   |   436564   301587   165921    77032       52    16330    71040    63967 
 A: C                      |   534459   406036   239231    95162     5130    40560    95063    94679 
 A: C++                    |   530273   397978   226740    93654     3220    38622    93087    92784 
 A: Python                 |   493959   330869   172690    80871      113    23198    74176    72165 
 A: Tcl                    |   427613   247405   137936    70103      132    24716    48329    47938 
 A: Java                   |      na.      na.      na.      na.      na.      na.      na.      na. 
 A: Ruby                   |   474768   339215   179359    82239       57    16433    74008    66150 


 x86_64-suse-linux-gnu     |   send     send     send     send    create    create    data     data  
 2024-11-01 08:30:00       |  NOTHING   END    CALLBACK   WAIT    PARENT    CHILD     BUS      BFL   
 ------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

 uds_fork:
 R: C                      |   538941   401696   222415    90240    12129    38597    89482    89265 
 R: C++                    |   523566   383756   212761    88028     9146    32236    71409    86209 
 R: Python                 |   478051   303102   157156    74471      345    21968    68123    65706 
 R: Tcl                    |      na.      na.      na.      na.      na.      na.      na.      na. 
 R: Java                   |      na.      na.      na.      na.      na.      na.      na.      na. 
 R: Ruby                   |   462675   326942   170818    78210     2769    16522    71408    64534 
 A: C                      |   536390   403260   238588    95014    14360    40744    94798    94384 
 A: C++                    |   528386   399932   229342    83742    11156    39144    92758    87008 
 A: Python                 |   497722   328038   170178    78212      361    23047    75014    72812 
 A: Tcl                    |   418773   244119   137038    69705     5580    24707    48258    47463 
 A: Java                   |      na.      na.      na.      na.      na.      na.      na.      na. 
 A: Ruby                   |   461779   339687   181173    82398     2714    16882    75640    68845 


 x86_64-suse-linux-gnu     |   send     send     send     send    create    create    data     data  
 2024-11-01 08:30:00       |  NOTHING   END    CALLBACK   WAIT    PARENT    CHILD     BUS      BFL   
 ------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

 uds_thread:
 R: C                      |   504425   375809   212943    88359    32173    37978    87952    88606 
 R: C++                    |   494135   365464   205933    88317    31582    35011    88070    73875 
 R: Python                 |      na.      na.      na.      na.      na.      na.      na.      na. 
 R: Tcl                    |   390334   224660   123795    63402      139    24137    44490    38410 
 R: Java                   |   463538   309542   161559    79059    19282    19779    72112    71591 
 R: Ruby                   |      na.      na.      na.      na.      na.      na.      na.      na. 
 A: C                      |      na.      na.      na.      na.      na.      na.      na.      na. 
 A: C++                    |      na.      na.      na.      na.      na.      na.      na.      na. 
 A: Python                 |      na.      na.      na.      na.      na.      na.      na.      na. 
 A: Tcl                    |      na.      na.      na.      na.      na.      na.      na.      na. 
 A: Java                   |      na.      na.      na.      na.      na.      na.      na.      na. 
 A: Ruby                   |      na.      na.      na.      na.      na.      na.      na.      na. 


 x86_64-suse-linux-gnu     |   send     send     send     send    create    create    data     data  
 2024-11-01 08:30:00       |  NOTHING   END    CALLBACK   WAIT    PARENT    CHILD     BUS      BFL   
 ------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

 uds_spawn:
 R: C                      |   539158   401000   234392    90476     3855    38965    90106    90331 
 R: C++                    |   527303   383711   206538    87939     2496    28590    77340    84945 
 R: Python                 |   486481   308086   154854    73834      101    22216    68473    66417 
 R: Tcl                    |   395278   215695   121196    62226      133    23641    43273    36560 
 R: Java                   |   466043   298337   165145    78412       70    15499    53845    53859 
 R: Ruby                   |   424367   308114   167472    77792       53    15817    67069    48598 
 A: C                      |   524604   403168   236936    94557     4936    40592    94800    94658 
 A: C++                    |   530558   397200   227054    94264     3200    37489    92515    92984 
 A: Python                 |   496188   330258   171413    80571      112    23057    74488    72380 
 A: Tcl                    |   411165   241968   133288    60287      132    24436    47517    46998 
 A: Java                   |      na.      na.      na.      na.      na.      na.      na.      na. 
 A: Ruby                   |   435118   331786   177345    81311       56    16716    74470    66146 
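
The rows of the tables above can be compared with a few lines of Python. This is only a sketch: the functions `parse` and `relative_loss` are assumptions and not part of results.bash, the sample contains three rows copied from the pipe table, and the percentage assumes that a higher number means better performance, as the analysis text suggests.

```python
SAMPLE = """\
 pipe:
 R: C                      |   540007   402644   233921    91004     3923    38406    90085    89640
 R: C++                    |   527056   387423   215244    88570     2497    36541    88042    88582
 A: Java                   |      na.      na.      na.      na.      na.      na.      na.      na.
"""

COLUMNS = ["NOTHING", "END", "CALLBACK", "WAIT", "PARENT", "CHILD", "BUS", "BFL"]

def parse(text):
    """Map (section, setup, language) to a dict of column name -> value or None."""
    results = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(":") and "|" not in line:
            section = line[:-1]                      # e.g. "pipe" or "uds_fork"
        elif "|" in line and line[:2] in ("R:", "A:"):
            label, values = line.split("|", 1)
            setup, language = label.split(":", 1)
            numbers = [None if v == "na." else int(v) for v in values.split()]
            key = (section, setup.strip(), language.strip())
            results[key] = dict(zip(COLUMNS, numbers))
    return results

def relative_loss(base, other):
    """Return by how many percent 'other' is below 'base'."""
    return 100.0 * (base - other) / base

table = parse(SAMPLE)
c_row = table[("pipe", "R", "C")]
cxx_row = table[("pipe", "R", "C++")]
loss = relative_loss(c_row["PARENT"], cxx_row["PARENT"])
print(f"C++ is {loss:.1f}% below C for PARENT with pipe (release)")
```

Header and separator lines are skipped because they do not start with R: or A:, so the same function can be applied to a complete table.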
perf-release libraries used
command: ./build.bash perf-release test
stdin
setup=perf-release
  action=test
  call actionTest: --enable-static=no --enable-threads=yes --enable-debug=no --with-python --with-cxx --with-tcl --with-ruby --with-libsqlite3 --with-libconfig --without-perl --without-php --without-go --disable-brain --disable-guard --without-csharp --with-java --without-vb
  ./build.bash: line 119: /dev/shm/dev1usr/Main/x86_64-suse-linux-gnu/debug/performance/inst/x86_64-suse-linux-gnu/perf-release/inst/libexec/NHI1/x86_64-suse-linux-gnu-env-inst.sh: No such file or directory
perf-aggressive libraries used
command: ./build.bash perf-aggressive test
stdin
setup=perf-aggressive
  action=test
  call actionTest: --enable-static=yes --enable-threads=no --enable-debug=no --with-python --with-cxx --with-tcl --with-ruby --with-libsqlite3 --with-libconfig --without-perl --without-php --without-go --disable-brain --disable-guard
  ./build.bash: line 119: /dev/shm/dev1usr/Main/x86_64-suse-linux-gnu/debug/performance/inst/x86_64-suse-linux-gnu/perf-aggressive/inst/libexec/NHI1/x86_64-suse-linux-gnu-env-inst.sh: No such file or directory