aes-ppc.pl (1.1.1.4) aes-ppc.pl (1.1.1.5)
@@ -1,30 +1,37 @@
-#!/usr/bin/env perl
+#! /usr/bin/env perl
+# Copyright 2007-2016 The OpenSSL Project Authors. All Rights Reserved.
+#
+# Licensed under the OpenSSL license (the "License"). You may not use
+# this file except in compliance with the License. You can obtain a copy
+# in the file LICENSE in the source distribution or at
+# https://www.openssl.org/source/license.html

+
 # ====================================================================
 # Written by Andy Polyakov <appro@fy.chalmers.se> for the OpenSSL
 # project. The module is, however, dual licensed under OpenSSL and
 # CRYPTOGAMS licenses depending on where you obtain it. For further
 # details see http://www.openssl.org/~appro/cryptogams/.
 # ====================================================================

 # Needs more work: key setup, CBC routine...
 #
 # ppc_AES_[en|de]crypt perform at 18 cycles per byte processed with
 # 128-bit key, which is ~40% better than 64-bit code generated by gcc
 # 4.0. But these are not the ones currently used! Their "compact"
 # counterparts are, for security reason. ppc_AES_encrypt_compact runs
 # at 1/2 of ppc_AES_encrypt speed, while ppc_AES_decrypt_compact -
 # at 1/3 of ppc_AES_decrypt.

 # February 2010
 #
 # Rescheduling instructions to favour Power6 pipeline gave 10%
-# performance improvement on the platfrom in question (and marginal
+# performance improvement on the platform in question (and marginal
 # improvement even on others). It should be noted that Power6 fails
 # to process byte in 18 cycles, only in 23, because it fails to issue
 # 4 load instructions in two cycles, only in 3. As result non-compact
 # block subroutines are 25% slower than one would expect. Compact
 # functions scale better, because they have pure computational part,
 # which scales perfectly with clock frequency. To be specific
 # ppc_AES_encrypt_compact operates at 42 cycles per byte, while
 # ppc_AES_decrypt_compact - at 55 (in 64-bit build).

--- 1422 unchanged lines hidden ---