AES-NI SSL Performance Study @ (2024)

homersssearch, July 01, 2024

A study of AES-NI acceleration using LibreSSL and OpenSSL

The Advanced Encryption Standard New Instructions set (AES-NI) is an extension to the x86 architecture for microprocessors from Intel and AMD. The purpose of AES-NI is to improve the speed of applications performing encryption and decryption using the Advanced Encryption Standard (AES), such as the AES-128 and AES-256 ciphers. AES-NI was designed to provide 4x to 8x speed improvements when using AES ciphers for bulk data encryption and decryption.

AES accelerated CPUs can increase efficiency and performance when setting up an SSL Terminator for your HTTP web cluster, a VPN link, a sshfs file system mount or moving bulk data over an SSH connection using scp or rsync.

The following table lists the results of a quick study of various ciphers used on desktop, laptop and mobile devices. The benchmarks focus on the ciphers available to TLS v1.2 and TLS v1.3 connections made by HTTP/2 and HTTPS clients. The ChaCha20 cipher is used as our baseline. ChaCha20 is a 256 bit stream cipher which is not AES accelerated and relies on raw CPU processing power. The other ciphers are 128 bit and 256 bit AES ciphers which are accelerated by the CPU through AES-NI when AES-NI is enabled through the BIOS. LibreSSL (OpenSSL) is used to test all ciphers on various CPUs we have access to. All numbers are in Megabytes per Second (MB/s) per single CPU core. Higher values are better.

Cipher Performance per CPU core

AES Performance per CPU core for TLS v1.2 Ciphers
(Higher Score is Better, Speeds are in Megabytes per Second per CPU core)

                    ChaCha20  AES-128-GCM  AES-256-GCM  AES-128-CBC  AES-256-CBC  Total Score
Intel Gold 5412U         719         3321         2957         1885         1381  = 10263
AMD Ryzen 7 1800X        573         3006         2642         1513         1101  =  8835
Intel W-2125             565         2808         2426         1698         1235  =  8732
Intel i7-6700            585         2607         2251         1561         1131  =  8135
Intel Silver 4410Y       519         2386         2123         1353          992  =  7373
Intel Gold 5217          598         2344         2018         1396         1014  =  7370
AMD EPYC 7702            410         2464         2175         1241          904  =  7194
Intel Silver 4215        566         2218         1910         1324          963  =  6981
AMD EPYC 7551            355         2213         1962         1114          811  =  6455
AMD EPYC 7402P           493         2478         2184         1244          907  =  6062
Intel i5-6500            410         1729         1520         1078          783  =  5520
Intel i7-4750HQ          369         1556         1353          688          499  =  4465
AMD FX 8350              367         1453         1278          716          514  =  4328
AMD FX 8150              347         1441         1273          716          515  =  4292
Intel E5-2650 v4         404         1479         1286          652          468  =  4289
Intel i7-2700K           382         1353         1212          763          552  =  4262
Intel i7-3840QM          373         1279         1143          725          520  =  4040
Intel i5-2500K           358         1274         1140          728          522  =  4022
AMD FX 6100              326         1344         1186          671          481  =  4008
AMD A10-7850K            321         1303         1176          685          499  =  3984
AMD A8-7600 Kaveri       306         1246         1108          648          470  =  3778
Intel E5-2640 v3         303         1286         1126          585          419  =  3719
AMD Opteron 6380         293         1203         1063          589          423  =  3571
AMD Opteron 6378         282         1138          986          561          406  =  3373
AMD Opteron 6274         232         1054          926          524          376  =  3112
Intel Xeon E5-2630       247          962          864          541          394  =  3008
Intel Xeon E5645         262          817          717          727          524  =  3047
Intel i7-2635QM          151          989          881          564          404  =  2989
Intel Xeon L5630         225          701          610          626          450  =  2612
Intel E5-2603 v4         236          866          754          382          274  =  2512
AMD Opteron 2382         249          651          485          215          150  =  1750
Intel i7-950             401          256          218          358          257  =  1490
Intel Xeon X5550         287          205          175          305          219  =  1191
AMD Phenom 965           404           84           63          282          198  =  1031
Intel Core2 Q9300        231          126          133          221          161  =   872
AMD X4 610e              225           59           44          198          139  =   665
Intel Core2 Q6600        173          141           79          108           77  =   578
Intel P4 3Ghz Will       109           26           23           55           43  =   256
Intel ATOM D525           98           51           43           28           20  =   240
Snapdragon S4 Pro        131           41            -            -            -  =   172
ARM Cortex A9             73           24            -            -            -  =    97

Testing Notes:
  AES-NI acceleration enabled if supported by BIOS and CPU
  Speeds in megabytes per second (MB/s) per real cpu core
  8192 byte blocks
  Five(5) test runs, the average speed reported
  Snapdragon and ARM Cortex values reported by Google Developers

How do I interpret the results ?

Let's say we have a project with a 10 gigabit connection to the internet. 10 gigabits per second is 1,250 megabytes per second. The web page designers are expecting the web server to concurrently encrypt and decrypt enough data to saturate the 10 gigabit connection. Let's also say 100% of our clients are using the AES-128-GCM based cipher just to make it easier to compare numbers from the table above.

We will need a CPU which can process 1,250 MB/s of AES encrypted data per CPU core. Since we need to receive (decrypt) and send (encrypt) data, the CPU should support at least two(2) CPU cores, each able to sustain 1,250 MB/s. From the test results above, any of the CPUs starting with the "AMD Opteron 6380" and faster would come close to or exceed this target; the "AMD Opteron 6380" itself can process 1,203 megabytes per second of AES-128-GCM data per CPU core, which is about 96% of the 1,250 MB/s needed. Note that the AMD Opteron 6380 is a 16 core CPU which leaves plenty of other CPU cores to do other work like network I/O, firewall rules or ZFS file system work.
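The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function and variable names are hypothetical, and the per-core figures are the AES-128-GCM values for three CPUs copied from the table above.

```python
# Convert a link speed into the per-core cipher throughput it demands.
# Names are illustrative and are not part of the original study.

def link_speed_mb_per_s(gigabits_per_second: float) -> float:
    """Decimal megabytes per second carried by a link of the given speed."""
    # 1 gigabit = 1000 megabits, and 8 bits make one byte.
    return gigabits_per_second * 1000 / 8

# AES-128-GCM MB/s per CPU core, copied from the table above.
AES_128_GCM_MB_S = {
    "Intel E5-2640 v3": 1286,
    "AMD Opteron 6380": 1203,
    "Intel Xeon L5630": 701,
}

target = link_speed_mb_per_s(10)  # throughput needed in each direction

for cpu, speed in AES_128_GCM_MB_S.items():
    share = speed / target * 100
    print(f"{cpu}: {speed} MB/s per core = {share:.0f}% of a 10 gigabit link")
```

A share of 100% or more means a single core can keep up with one direction of the link for that cipher.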

In the real world the situation would be more complicated. Clients connect with a variety of ciphers and the system is not dedicated to just cipher processing. It is also possible that the cipher processing of multiple CPU cores can be added together to reach the desired speed. The "Intel Xeon L5630" has four cores and each core could process 701 MB/s of AES-128-GCM data, for a combined total of around 2,804 MB/s; just enough speed for encrypting and decrypting data on a 10 gigabit link using AES-128-GCM.
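As a rough sketch of the multi-core case, the following Python estimates how many cores are needed to cover both directions of the link. It assumes throughput adds up linearly across cores and that nothing else competes for them, which is optimistic; the function name is hypothetical.

```python
import math

def cores_needed(link_mb_s: float, per_core_mb_s: float, directions: int = 2) -> int:
    """Smallest number of cores whose combined throughput covers the link.

    Assumes perfectly linear scaling across cores (an idealization).
    """
    return math.ceil(link_mb_s * directions / per_core_mb_s)

LINK_MB_S = 1250        # 10 gigabit link, from the text above
L5630_PER_CORE = 701    # Intel Xeon L5630, AES-128-GCM, from the table

print(cores_needed(LINK_MB_S, L5630_PER_CORE))  # cores to encrypt and decrypt
print(4 * L5630_PER_CORE)                       # combined MB/s of four cores
```

With the table's numbers, the required 2,500 MB/s (send plus receive) is covered by four cores at 701 MB/s each.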

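Note that AES-NI throughput scales with real (physical) CPU cores. Hyper threaded (HT) or virtual cores share the AES execution units of their physical core, so they should not be counted as additional cipher capacity.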

Check out our H2O and Nginx tutorials for tips on configuring a fast and secure web server or SSL terminator.

How can I test my own CPU ?

Using the following commands, download and build LibreSSL. The build process statically builds the LibreSSL binaries and libraries in the local directory. No files are installed to the system. Once the build is done, run each of the cipher speed tests with a 10 second sleep in between to make sure the load of the machine reached zero(0). When you are done testing, delete the build directory and everything is cleaned up.

# NOTE: Ubuntu requires GCC and GNU Make to compile libressl
# sudo apt install gcc make

cd /tmp
wget -4 <download URL for libressl-3.9.2.tar.gz>
tar zxvf libressl-3.9.2.tar.gz
cd libressl-3.9.2
./configure && make && echo SUCCESS

./apps/openssl/openssl speed -elapsed -evp chacha
sleep 10
./apps/openssl/openssl speed -elapsed -evp aes-128-gcm
sleep 10
./apps/openssl/openssl speed -elapsed -evp aes-256-gcm
sleep 10
./apps/openssl/openssl speed -elapsed -evp aes-128-cbc
sleep 10
./apps/openssl/openssl speed -elapsed -evp aes-256-cbc
echo FINISHED

Cipher Speed Test Output Example

The LibreSSL (OpenSSL) cipher speed test will print out a few lines of output per test performed. The value we are interested in is on the last line under the label "8192 bytes". Our interests are focused on bulk data transfers and "8192 bytes" is the largest block test shown. The "8192 bytes" value is the amount of data the CPU can process using the cipher specified in thousands of bytes per second. Divide the value shown by one(1) thousand to get megabytes per second which is the same as our results in the table above.

# use dmesg and search for the cpu type. for example,
$ dmesg | grep CPU0
[ 0.120426] smpboot: CPU0: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (fam: 06, model: 5e, stepping: 03)

# run the series of cipher speed tests, chacha is first...

$ ./apps/openssl/openssl speed -elapsed -evp chacha
You have chosen to measure elapsed time instead of user CPU time.
Doing chacha for 3s on 16 size blocks: 66892965 chacha's in 3.00s
Doing chacha for 3s on 64 size blocks: 25017290 chacha's in 3.00s
Doing chacha for 3s on 256 size blocks: 6502076 chacha's in 3.00s
Doing chacha for 3s on 1024 size blocks: 1692776 chacha's in 3.00s
Doing chacha for 3s on 8192 size blocks: 214511 chacha's in 3.00s
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
chacha           356762.48k   533702.19k   554843.82k   577800.87k   585758.04k  <----

... the result is 585758.04k / 1000 = 585 MB/s

$ ./apps/openssl/openssl speed -elapsed -evp aes-128-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-gcm for 3s on 16 size blocks: 134661060 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 64 size blocks: 79432576 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 256 size blocks: 28895019 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 1024 size blocks: 7559486 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 8192 size blocks: 954887 aes-128-gcm's in 3.00s
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm      718192.32k  1694561.62k  2465708.29k  2580304.55k  2607478.10k  <----

... the result is 2607478.10k / 1000 = 2,607 MB/s

$ ./apps/openssl/openssl speed -elapsed -evp aes-256-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-gcm for 3s on 16 size blocks: 125601150 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 64 size blocks: 75507034 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 256 size blocks: 25591359 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 1024 size blocks: 6547497 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 8192 size blocks: 824454 aes-256-gcm's in 3.00s
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-gcm      669872.80k  1610816.73k  2183795.97k  2234878.98k  2251309.06k  <----

... the result is 2251309.06k / 1000 = 2,251 MB/s

$ ./apps/openssl/openssl speed -elapsed -evp aes-128-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 250707357 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 71204109 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 18108237 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 4563775 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 571798 aes-128-cbc's in 3.00s
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     1337105.90k  1519020.99k  1545236.22k  1557768.53k  1561389.74k  <----

... the result is 1561389.74k / 1000 = 1,561 MB/s

$ ./apps/openssl/openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 185732038 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 51745988 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 13073843 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 3280738 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 414517 aes-256-cbc's in 3.00s
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      990570.87k  1103914.41k  1115634.60k  1119825.24k  1131907.75k  <----

... the result is 1131907.75k / 1000 = 1,131 MB/s
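The conversion described above (take the "8192 bytes" column and divide by one thousand) can be sketched in Python as follows. The helper name is hypothetical, and the sketch assumes the result line has the layout shown in the example output: the cipher name followed by one value per block size, each ending in "k".

```python
def mb_per_s_from_speed_line(line: str) -> int:
    """Return the "8192 bytes" column of an `openssl speed` result line in whole MB/s.

    Expects the cipher name followed by the 16, 64, 256, 1024 and 8192 byte
    columns, so the 8192 byte value is the sixth whitespace-separated field.
    """
    fields = line.split()
    value = fields[5]
    if not value.endswith("k"):
        raise ValueError(f"unexpected column value: {value!r}")
    # The tool reports thousands of bytes per second; divide by 1000 for MB/s
    # and drop the fraction, as the worked examples in the text do.
    return int(float(value[:-1]) / 1000)


# Result line copied from the aes-128-gcm run on the Intel i7-6700 above.
line = "aes-128-gcm 718192.32k 1694561.62k 2465708.29k 2580304.55k 2607478.10k"
print(mb_per_s_from_speed_line(line), "MB/s")
```

Indexing the sixth field rather than the last one keeps the helper usable on output that carries an extra "16384 bytes" column after the 8192 byte value.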


Is OpenSSL faster than LibreSSL ?

Yes, both OpenSSL and BoringSSL are significantly faster than LibreSSL when using modern ciphers. LibreSSL is probably slower due to more locking, no internal crypto devices and single threaded processes with the idea of being more secure. The following window shows a performance query using the elapsed speed tests built into both OpenSSL and LibreSSL. The server has a moderately powerful CPU with AES-NI enabled in the BIOS. The machine is set up with an Intel i5-6500 CPU and FreeBSD 11, with LibreSSL v3.0.1 and OpenSSL v1.1.1a built from source. The results show that OpenSSL is between 2.3x and 6.7x faster than LibreSSL using ChaCha20 as well as AES-128-GCM and AES-256-GCM. This performance difference is great enough that you would need multiple https servers running Nginx built with LibreSSL to equal the speed of one(1) Nginx server built with OpenSSL.

Tip: take a look at the Nginx server resource sizing guide for deploying Nginx on bare metal servers and the Nginx testing methodology. The guide shows graduated hardware configurations and how many requests per second, transactions per second and total throughput an https server could achieve.

AES Performance per CPU core for TLS v1.2 Ciphers
(Higher is Better, Speeds in Megabytes per Second)

                    ChaCha20  AES-128-GCM  AES-256-GCM  AES-128-CBC  AES-256-CBC  Total Score
Intel i5-6500           2762         4900         3554         1067          780  = 13063   OpenSSL v1.1.1a
                        1760         4455         3370          460          402  = 10447   BoringSSL v2017_12
                         410         1729         1520         1078          783  =  5520   LibreSSL v3.0.1

################## Testing Results #####################

dmesg | grep -i CPU
CPU: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (3192.14-MHz K8-class CPU)

cd /tmp
wget <download URL for libressl-3.0.1.tar.gz>
tar zxvf libressl-3.0.1.tar.gz
cd libressl-3.0.1
./configure && make && echo SUCCESS

./apps/openssl/openssl speed -elapsed -evp chacha
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
chacha           229894.55k   374728.51k   401326.42k   407606.34k   410545.95k
                                                                     ^^^

./apps/openssl/openssl speed -elapsed -evp aes-128-gcm
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm      578578.66k  1037298.77k  1496023.55k  1667607.21k  1729668.50k
                                                                    ^^^^

./apps/openssl/openssl speed -elapsed -evp aes-256-gcm
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-gcm      514792.29k   953548.57k  1340996.10k  1478150.01k  1520833.77k
                                                                    ^^^^

./apps/openssl/openssl speed -elapsed -evp aes-128-cbc
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     1070909.28k  1059120.83k  1084207.69k  1090894.01k  1078315.69k
                                                                    ^^^^

./apps/openssl/openssl speed -elapsed -evp aes-256-cbc
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      806110.46k   767273.81k   793146.46k   803538.08k   783499.41k
                                                                     ^^^

cd /tmp
wget <download URL for openssl-1.1.1a.tar.gz>
tar zxvf openssl-1.1.1a.tar.gz
cd openssl-1.1.1a
./config && make
cp /tmp/openssl-1.1.1a/ /usr/local/lib/
cp /tmp/openssl-1.1.1a/ /usr/local/lib/

./apps/openssl speed -elapsed -evp chacha20
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
chacha20         320078.35k   547365.25k  1287720.93k  2649847.21k  2762595.49k  2769084.88k
                                                                    ^^^^

./apps/openssl speed -elapsed -evp aes-128-gcm
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-gcm      453159.25k  1215246.40k  2437021.95k  3909602.78k  4900248.28k  4996923.22k
                                                                    ^^^^

./apps/openssl speed -elapsed -evp aes-256-gcm
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-gcm      397133.57k  1118061.03k  2050411.88k  3017616.18k  3554319.58k  3603072.56k
                                                                    ^^^^

./apps/openssl speed -elapsed -evp aes-128-cbc
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      812677.93k  1037389.63k  1066182.04k  1068901.72k  1067816.15k  1074969.69k
                                                                    ^^^^

./apps/openssl speed -elapsed -evp aes-256-cbc
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc      720262.90k   757488.79k   775043.00k   776824.49k   780029.74k   792199.17k

git clone <repository URL>
[...] -GNinja -DCMAKE_BUILD_TYPE=Release .. && ninja
cd build/tools
./bssl speed
...
Did 544000 AES-128-GCM (8192 bytes) seal operations in 1000170us (543907.5 ops/sec): 4455.7 MB/s
Did 412000 AES-256-GCM (8192 bytes) seal operations in 1001476us (411392.8 ops/sec): 3370.1 MB/s
Did 215000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1000321us (214931.0 ops/sec): 1760.7 MB/s
...
Did 57000 AES-128-CBC-SHA1 (8192 bytes) seal operations in 1014216us (56201.0 ops/sec): 460.4 MB/s
Did 50000 AES-256-CBC-SHA1 (8192 bytes) seal operations in 1018187us (49106.9 ops/sec): 402.3 MB/s
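To see where the "2.3x to 6.7x" range comes from, the per-cipher ratios can be computed directly from the comparison table. The sketch below is illustrative only; the dictionary layout and the `speedup` helper are not part of the original study, and the numbers are the 8192 byte MB/s values for the Intel i5-6500 copied from the table above.

```python
# MB/s per core on the Intel i5-6500, copied from the comparison table above.
SPEEDS = {
    "OpenSSL v1.1.1a": {
        "ChaCha20": 2762, "AES-128-GCM": 4900, "AES-256-GCM": 3554,
        "AES-128-CBC": 1067, "AES-256-CBC": 780,
    },
    "LibreSSL v3.0.1": {
        "ChaCha20": 410, "AES-128-GCM": 1729, "AES-256-GCM": 1520,
        "AES-128-CBC": 1078, "AES-256-CBC": 783,
    },
}

def speedup(cipher: str) -> float:
    """How many times faster OpenSSL is than LibreSSL for one cipher."""
    return SPEEDS["OpenSSL v1.1.1a"][cipher] / SPEEDS["LibreSSL v3.0.1"][cipher]

for cipher in SPEEDS["OpenSSL v1.1.1a"]:
    print(f"{cipher}: {speedup(cipher):.1f}x")
```

The ratios for ChaCha20 and the two GCM ciphers span the quoted range, while the two CBC ciphers come out at roughly 1.0x, so the advantage in these measurements is limited to the modern AEAD ciphers.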

How can I test OpenSSL with AES-NI on and off from the command line?

Using the "OPENSSL_ia32cap" environment variable you can force OpenSSL to disable AES-NI acceleration. The following two tests show AES-NI results off and then back on. Notice that without AES-NI, the aes-128-gcm cipher processed data at 212 MB/s. With AES-NI enabled the same aes-128-gcm cipher speed jumped to 1,357 MB/s! A six(6) times performance boost.

# cpu example type: AMD FX 6100
$ dmesg | grep -i cpu
[ 0.277326] smpboot: CPU0: AMD FX(tm)-6100 Six-Core Processor (fam: 15, model: 01, stepping: 02)

OpenSSL AES-NI = OFF

$ OPENSSL_ia32cap="~0x200000200000000" openssl speed -elapsed -evp aes-128-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-gcm for 3s on 16 size blocks: 11810234 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 64 size blocks: 3458208 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 256 size blocks: 2269863 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 1024 size blocks: 612727 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 8192 size blocks: 77820 aes-128-gcm's in 3.00s
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm       62987.91k    73775.10k   193694.98k   209144.15k   212500.48k

... the result is 212500.48k / 1000 = 212 MB/s

OpenSSL AES-NI = ON

$ openssl speed -elapsed -evp aes-128-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-gcm for 3s on 16 size blocks: 47814322 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 64 size blocks: 32192031 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 256 size blocks: 13198683 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 1024 size blocks: 3757898 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 8192 size blocks: 497117 aes-128-gcm's in 3.00s
The 'numbers' are in 1000s of bytes per second processed.
type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm      255009.72k   686763.33k  1126287.62k  1282695.85k  1357460.82k

... the result is 1357460.82k / 1000 = 1,357 MB/s
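The mask value used above is not arbitrary. As documented for OPENSSL_ia32cap, bit 57 of OpenSSL's capability vector denotes AES-NI and bit 33 denotes PCLMULQDQ (the carry-less multiply that GCM relies on), and a leading "~" clears the listed bits. The short Python sketch below rebuilds the value from those two bit positions; the constant names are illustrative.

```python
# Bit positions in OpenSSL's x86 capability vector (OPENSSL_ia32cap).
# The upper 32 bits mirror CPUID.1:ECX, so ECX bit 25 becomes bit 57, etc.
AESNI_BIT = 57       # AES-NI instruction set extension
PCLMULQDQ_BIT = 33   # carry-less multiplication, used by GCM

mask = (1 << AESNI_BIT) | (1 << PCLMULQDQ_BIT)

# A leading "~" tells OpenSSL to clear these bits, i.e. to behave as if the
# CPU did not have the features.
setting = f"~{mask:#x}"
print(f'OPENSSL_ia32cap="{setting}"')

# Speedup observed in the two runs above (8192 byte column, in MB/s).
print(f"{1357 / 212:.1f}x faster with AES-NI enabled")
```

Disabling PCLMULQDQ together with AES-NI matters for the GCM ciphers, since both instructions contribute to their accelerated code path.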

How can I test a remote server cipher ?

Use the openssl s_client tool and query a remote server. You can let the client and server choose the most preferred cipher or you can specify the exact cipher name you want to use during the connection.

# Test using the client/server negotiated cipher
echo -n | ./apps/openssl/openssl s_client -connect <host>:<port>

# Test using the ChaCha cipher
echo -n | ./apps/openssl/openssl s_client -cipher ECDHE-ECDSA-CHACHA20-POLY1305 -connect <host>:<port>

# Test using the AES-128-GCM cipher
echo -n | ./apps/openssl/openssl s_client -cipher ECDHE-ECDSA-AES128-GCM-SHA256 -connect <host>:<port>
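The same kind of check can be scripted. The following is a minimal sketch using Python's standard `ssl` module rather than `s_client`; the helper names are hypothetical and the host you pass in is up to you. One deliberate choice to note: when a cipher name is given, the sketch caps the connection at TLS v1.2, because `set_ciphers()` only restricts TLS v1.2 and older suites and a TLS v1.3 handshake would otherwise ignore the request.

```python
import socket
import ssl
from typing import Optional, Tuple


def make_context(cipher: Optional[str] = None) -> ssl.SSLContext:
    """Build a client context, optionally restricted to one TLS v1.2 cipher."""
    ctx = ssl.create_default_context()
    if cipher is not None:
        ctx.set_ciphers(cipher)  # affects TLS v1.2 and older suites only
        # Cap at TLS v1.2 so the named cipher is the one actually negotiated.
        ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_cipher(host: str, port: int = 443,
                      cipher: Optional[str] = None) -> Tuple[str, str, int]:
    """Connect and return (cipher name, protocol version, secret bits)."""
    ctx = make_context(cipher)
    with socket.create_connection((host, port), timeout=10) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.cipher()
```

Calling `negotiated_cipher(host)` reports whatever the client and server agree on, while `negotiated_cipher(host, cipher="ECDHE-ECDSA-AES128-GCM-SHA256")` raises an `ssl.SSLError` if the server does not accept that cipher.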
Article information

Author: Manual Maggio
