hubertf's NetBSD Blog
 
[20100222] Google Summer of Code 2010 NetBSD swcryptX Project Suggestion (Updated #2)
Please see the update below before applying for this GSoC project!

I've been thinking of a neat-o project for this year's Google Summer of Code:

    Abstract: The goal of this project is to provide crypto acceleration by utilizing multiple CPU cores. The work is to extend the existing software-only "swcrypto" crypto driver and hook that up with NetBSD's OpenCrypto framework.

Overview of operation

The opencrypto(9) framework exists to coordinate hardware acceleration in NetBSD. Applications of the framework can be inside the kernel, like the FAST_IPSEC IPsec implementation, or in userland, like OpenSSL with the "cryptodev" engine. Crypto drivers can be realized in software or in hardware. Hardware drivers can be used to instruct e.g. the AMD Geode LX's AES block or a HIFN chip to perform cryptographic operations. Upon system startup, the crypto drivers register with the opencrypto(9) framework, telling it what operations they can perform. When an operation is required later, the framework checks which crypto device is currently not busy and offloads the operation to that device. Upon completion, the result is fed back to the application.
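The register-then-dispatch flow described above can be sketched as a small toy model. This is not kernel code and not the real opencrypto(9) API: the `Driver` and `Framework` classes, the driver names and the algorithm lists are all made up for illustration, and the real selection logic is more involved.

```python
from dataclasses import dataclass


@dataclass
class Driver:
    """Hypothetical stand-in for a crypto driver (hardware or software)."""
    name: str
    algorithms: frozenset  # operations the driver announced at registration
    busy: bool = False


class Framework:
    """Toy model of the framework: keeps a list of registered drivers."""

    def __init__(self):
        self.drivers = []

    def register(self, driver):
        # Called once per driver at "system startup".
        self.drivers.append(driver)

    def select(self, algorithm):
        # Pick the first registered driver that supports the operation
        # and is not busy; return None if there is no such driver.
        for driver in self.drivers:
            if algorithm in driver.algorithms and not driver.busy:
                return driver
        return None


# Illustrative setup: capabilities are assumptions, not real driver data.
fw = Framework()
hw = Driver("hifn0", frozenset({"aes-cbc"}))
sw = Driver("swcrypto0", frozenset({"aes-cbc", "sha1"}))
fw.register(hw)
fw.register(sw)
```

With both drivers idle, `fw.select("aes-cbc")` returns the hardware driver because it registered first; once `hw.busy` is set, the same call falls back to the software driver.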

The following image illustrates the components and their interaction.

Limitations

Offloading the cryptographic requests involves some overhead: data needs to be transferred to the hardware and back. On systems with a slow CPU, this overhead is small compared to the time the CPU would need to perform the operation itself. On faster CPUs, the overhead weighs more heavily, making the benefit of the crypto hardware negligible.

As an example, while a hifn(4) chip can provide worthwhile speedups on 500MHz and 1GHz CPUs, no performance win is seen on a 2.4GHz CPU.
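To make the break-even argument concrete, here is a minimal sketch of the trade-off. It assumes a very simple cost model (a fixed per-request transfer overhead plus a throughput-bound processing time), and all numbers in the example calls are hypothetical, not measurements of any real chip or CPU.

```python
def offload_wins(nbytes, overhead_s, hw_bytes_per_s, cpu_bytes_per_s):
    """Return True if offloading a request is faster than doing it on the CPU.

    Assumed model (a simplification, not taken from any benchmark):
      offload time = fixed transfer overhead + nbytes / hardware throughput
      local time   = nbytes / CPU throughput
    """
    offload_time = overhead_s + nbytes / hw_bytes_per_s
    local_time = nbytes / cpu_bytes_per_s
    return offload_time < local_time


# Hypothetical figures: a 1500-byte packet, 20 microseconds of bus overhead,
# and a device that processes 100 MB/s once the data has arrived.
slow_cpu = offload_wins(1500, 20e-6, 100e6, 10e6)  # CPU manages 10 MB/s
fast_cpu = offload_wins(1500, 20e-6, 100e6, 80e6)  # CPU manages 80 MB/s
```

In this model the slow CPU benefits from offloading while the fast one does not, even though the device is still faster than the CPU in raw throughput: the fixed overhead eats the difference.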

Proposal

The communication overhead involves data transfers over a PCI bus, which is relatively slow compared to today's CPUs. Avoiding that data transfer is a worthwhile goal. By using one or more cores of today's multi-core CPUs solely for crypto acceleration, a measurable improvement of crypto performance is expected. At the same time, no special hardware beyond the CPU is required. This makes it easy to turn standard contemporary systems into fast crypto systems.

The following image illustrates the idea of interoperation between a CPU core that runs the kernel and application code and three cores that are dedicated to crypto code.

Implementation Roadmap

This is where it gets fishy. ;)

  • The existing opencrypto(4) framework probably needs to be made MP-aware at the same time, employing proper use of NetBSD's locking framework. (Already done)
  • The existing swcrypto(4) needs to be adjusted for operation on multiple CPUs at the same time.
  • A way is needed to decide how many CPUs are dedicated to running swcrypto(4) instances.
  • CPUs that run swcrypto(4) need to be taken out of the usual NetBSD CPU scheduling so that they are available exclusively for crypto.
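The general shape of "several workers dedicated to crypto, fed from one request queue" can be sketched in userland. This is only a model of the idea, not of swcrypto(4): the function names are made up, SHA-256 via hashlib stands in for an arbitrary crypto operation, the threads are not pinned to CPUs or removed from scheduling, and CPython's interpreter lock limits how much real parallelism this gets.

```python
import hashlib
import queue
import threading


def crypto_worker(requests, results):
    """Serve crypto requests until a None sentinel arrives."""
    while True:
        item = requests.get()
        if item is None:
            break
        req_id, data = item
        # The "crypto operation": a digest, as a stand-in for real work.
        results.put((req_id, hashlib.sha256(data).hexdigest()))


def run_pool(payloads, nworkers=3):
    """Hash all payloads on nworkers dedicated threads, keeping input order."""
    requests, results = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=crypto_worker, args=(requests, results))
        for _ in range(nworkers)
    ]
    for worker in workers:
        worker.start()
    for req_id, data in enumerate(payloads):
        requests.put((req_id, data))
    for _ in workers:
        requests.put(None)  # one shutdown sentinel per worker
    # Completion order is arbitrary, so results are matched up by request id.
    digests = {}
    for _ in range(len(payloads)):
        req_id, digest = results.get()
        digests[req_id] = digest
    for worker in workers:
        worker.join()
    return [digests[i] for i in range(len(payloads))]
```

Calling `run_pool([b"packet one", b"packet two"])` returns the two hex digests in input order. The default of three workers mirrors the three crypto cores in the picture above; how to choose that number is exactly the open question in the roadmap.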

Requirements

In no particular order:
  • Know how to build and install a kernel
  • Understanding of fine grained SMP and locking
  • How to use NetBSD's kernel threads, code-wise
  • How to interact with NetBSD's scheduler, code-wise
  • Tell the scheduler to pin a specific kernel thread to a specific CPU
  • Interaction between applications (IPsec, OpenSSL) with opencrypto(9), code-wise
  • Interaction of crypto providers with opencrypto(9), code-wise
  • Hardware! You won't be able to do this without at least two CPU cores in your machine. The more the better.
  • Benchmarking & a test setup for it

Project Applications

Please follow the NetBSD Project Application/Proposal HowTo if you're serious about working on this project.

If you have any questions, let me know. Public discussion should take place on the tech-crypto@ list.

Update: There was some discussion. In particular, my understanding of the interaction of the various layers as outlined above is not 100% accurate, and userland applications using opencrypto already seem to benefit from multiple kernel threads. In-kernel applications apparently do not, and before providing multiple crypto servers in the kernel (as suggested), work should probably be done first to make sure such applications exist. Examples of such applications are IPsec (and the whole network stack), but also others like cgd (which AFAIU currently does not use opencrypto(9)).



Disclaimer: All opinion expressed here is purely my own. No responsibility is taken for anything.

Copyright (c) Hubert Feyrer