The Block Cipher Square

Joan Daemen (1), Lars Knudsen (2), Vincent Rijmen (2,*)

(1) Banksys, Haachtesteenweg 1442, B-1130 Brussel, Belgium, Daemen.J@banksys.be
(2) Katholieke Universiteit Leuven, ESAT-COSIC, K. Mercierlaan 94, B-3001 Heverlee, Belgium



Abstract. In this paper we present a new 128-bit block cipher called Square. The original design of Square concentrates on the resistance against differential and linear cryptanalysis. However, after the initial design a dedicated attack was mounted that forced us to augment the number of rounds. The goal of this paper is the publication of the resulting cipher for public scrutiny. A C implementation of Square is available that runs at 2.63 MByte/s on a 100 MHz Pentium. Our M68HC05 Smart Card implementation fits in 547 bytes and takes less than 2 msec (4 MHz clock). The high degree of parallelism allows hardware implementations in the Gbit/s range today.

1 Introduction

In this paper, we propose the block cipher Square. It has a block length and key length of 128 bits. However, its modular design approach allows extensions to higher block lengths in a straightforward way. The cipher has a new self-reciprocal structure, similar to that of Threeway and SHARK [2,15].

The structure of the cipher, i.e., the types of building blocks and their interaction, has been carefully chosen to allow very efficient implementations on a wide range of processors. The specific choice of the building blocks themselves has been led by the resistance of the cipher against differential and linear cryptanalysis. After treating the structure of the cipher and its consequences for implementations, we explain the strategies followed to thwart linear and differential cryptanalysis. This is followed by a description of an efficient attack that exploits the particular properties of the cipher structure.

We do not encourage anyone to use Square today in any sensitive application. Clearly, confidence in the security of any cryptographic design must be based on the resistance against effective cryptanalysis after intense public scrutiny.


(*) F.W.O. research assistant, sponsored by the Fund for Scientific Research - Flanders (Belgium).


A reference implementation of Square is available from the following URL: http://www.esat.kuleuven.ac.be/rijmen/square.

2 Structure of Square

Square is an iterated block cipher with a block length and a key length of 128 bits each. The round transformation of Square is composed of four distinct transformations. It is however important to note that these four building blocks can be efficiently combined in a single set of table-lookups and exor operations. This will be treated later in the section on implementation aspects.

The basic building blocks of the cipher are five different invertible transformations that operate on a $4 \times 4$ array of bytes. The element of a state $a$ in row $i$ and column $j$ is specified as $a_{i,j}$. Both indices start from 0.


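2.1 A Linear Transformation $\theta$

$\theta$ is a linear operation that operates separately on each of the four rows of a state. We have

$$\theta: b = \theta(a) \Leftrightarrow b_{i,j} = c_j a_{i,0} \oplus c_{j-1} a_{i,1} \oplus c_{j-2} a_{i,2} \oplus c_{j-3} a_{i,3},$$

where the multiplication is in GF($2^8$) and the indices of $c$ must be taken modulo 4. Note that the field GF($2^8$) has characteristic 2 [9]. This means that the addition in the field corresponds to the bitwise exor.

The rows of a state can be denoted by polynomials, i.e., the polynomial corresponding to row $i$ of a state $a$ is given by

$$a_i(x) = a_{i,0} \oplus a_{i,1} x \oplus a_{i,2} x^2 \oplus a_{i,3} x^3.$$

Using this notation, and defining $c(x) = \bigoplus_j c_j x^j$, we can describe $\theta$ as a modular polynomial multiplication:

$$b = \theta(a) \Leftrightarrow b_i(x) = c(x)\, a_i(x) \bmod (1 \oplus x^4) \quad \text{for } 0 \le i < 4.$$

The inverse of $\theta$ corresponds to a polynomial $d(x)$ given by

$$d(x)\, c(x) = 1 \pmod{1 \oplus x^4}.$$

2.2 A Nonlinear Transformation $\gamma$

$\gamma$ is a nonlinear byte substitution, identical for all bytes. We have

$$\gamma: b = \gamma(a) \Leftrightarrow b_{i,j} = S_\gamma(a_{i,j}),$$

with $S_\gamma$ an invertible 8-bit substitution table or S-box. The inverse of $\gamma$ consists of the application of the inverse substitution $S_\gamma^{-1}$ to all bytes of a state.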

2.3 A Byte Permutation $\pi$

The effect of $\pi$ is the interchanging of rows and columns of a state. We have

$$\pi: b = \pi(a) \Leftrightarrow b_{i,j} = a_{j,i}.$$

$\pi$ is an involution, hence $\pi^{-1} = \pi$.

2.4 Bitwise Round Key Addition $\sigma$

$\sigma[k^t]$ consists of the bitwise addition of a round key $k^t$. We have

$$\sigma[k^t]: b = \sigma[k^t](a) \Leftrightarrow b = a \oplus k^t.$$

The inverse of $\sigma[k^t]$ is $\sigma[k^t]$ itself.

2.5 The Round Key Evolution

The round keys $k^t$ are derived from the cipher key $K$ in the following way. $k^0$ equals the cipher key $K$. The other round keys are derived iteratively by means of the invertible affine transformation $\psi$:

$$\psi: k^t = \psi(k^{t-1}).$$

2.6 The Cipher Square

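The building blocks are composed into the round transformation denoted by $\rho[k^t]$:

$$\rho[k^t] = \sigma[k^t] \circ \pi \circ \gamma \circ \theta \tag{1}$$

Square is defined as eight rounds preceded by a key addition $\sigma[k^0]$ and by $\theta^{-1}$:

$$\text{Square}[k] = \rho[k^8] \circ \rho[k^7] \circ \rho[k^6] \circ \rho[k^5] \circ \rho[k^4] \circ \rho[k^3] \circ \rho[k^2] \circ \rho[k^1] \circ \sigma[k^0] \circ \theta^{-1} \tag{2}$$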

2.7 The Inverse Cipher

As will be shown in Section 9, the structure of Square lends itself to efficient implementations. For a number of modes of operation it is important that this is also the case for the inverse cipher. Therefore, Square has been designed in such a way that the structure of its inverse is equal to that of the cipher itself, with the exception of the key schedule. Note that this identity in structure differs from the identity of components and structure in IDEA [10].

From (2) it can be seen that

$$\text{Square}^{-1}[k] = \theta \circ \sigma^{-1}[k^0] \circ \rho^{-1}[k^1] \circ \rho^{-1}[k^2] \circ \rho^{-1}[k^3] \circ \rho^{-1}[k^4] \circ \rho^{-1}[k^5] \circ \rho^{-1}[k^6] \circ \rho^{-1}[k^7] \circ \rho^{-1}[k^8]$$


Fig. 1. Geometrical representation of the basic operations of Square. $\theta$ consists of 4 parallel linear diffusion mappings. $\gamma$ consists of 16 separate substitutions. $\pi$ is a transposition.


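with

$$\rho^{-1}[k^t] = \theta^{-1} \circ \gamma^{-1} \circ \pi^{-1} \circ \sigma^{-1}[k^t] = \theta^{-1} \circ \gamma^{-1} \circ \pi \circ \sigma[k^t]. \tag{3}$$

It may seem that the structure of the inverse cipher differs substantially from that of the cipher itself. By exploiting some algebraic properties of the building blocks, we can show this not to be the case. Since $\pi$ only transposes the bytes $a_{i,j}$ and $\gamma$ only operates on the individual bytes, independent of their position $(i,j)$, we have

$$\gamma^{-1} \circ \pi = \pi \circ \gamma^{-1}.$$

Moreover, since $\theta^{-1}(a) \oplus k^t = \theta^{-1}(a \oplus \theta(k^t))$, we have

$$\sigma[k^t] \circ \theta^{-1} = \theta^{-1} \circ \sigma[\theta(k^t)].$$

We now define the round transformation of the inverse cipher as

$$\rho'[k^t] = \sigma[k^t] \circ \pi \circ \gamma^{-1} \circ \theta^{-1}, \tag{4}$$

which has the same structure as $\rho$ itself, except that $\gamma$ and $\theta$ are replaced by $\gamma^{-1}$ and $\theta^{-1}$ respectively. Using the algebraic properties above, we can derive

$$\begin{aligned}
\theta \circ \sigma[k^0] \circ \rho^{-1}[k^1] &= \theta \circ \sigma[k^0] \circ \theta^{-1} \circ \gamma^{-1} \circ \pi \circ \sigma[k^1] \\
&= \theta \circ \theta^{-1} \circ \sigma[\theta(k^0)] \circ \gamma^{-1} \circ \pi \circ \sigma[k^1] \\
&= \sigma[\theta(k^0)] \circ \gamma^{-1} \circ \pi \circ \sigma[k^1] \\
&= \sigma[\theta(k^0)] \circ \pi \circ \gamma^{-1} \circ \sigma[k^1] \circ \theta^{-1} \circ \theta \\
&= \sigma[\theta(k^0)] \circ \pi \circ \gamma^{-1} \circ \theta^{-1} \circ \sigma[\theta(k^1)] \circ \theta \\
&= \rho'[\theta(k^0)] \circ \sigma[\theta(k^1)] \circ \theta.
\end{aligned}$$

This equation can be generalized in a straightforward way to include more than one round. Now, with $\kappa^t = \theta(k^{8-t})$, we have

$$\text{Square}^{-1} = \rho'[\kappa^8] \circ \rho'[\kappa^7] \circ \rho'[\kappa^6] \circ \rho'[\kappa^5] \circ \rho'[\kappa^4] \circ \rho'[\kappa^3] \circ \rho'[\kappa^2] \circ \rho'[\kappa^1] \circ \sigma[\kappa^0] \circ \theta.$$

Hence the inverse cipher is equal to the cipher itself with $\gamma$ replaced by $\gamma^{-1}$, with $\theta$ by $\theta^{-1}$ and different round key values.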
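The identity between the cipher and its inverse can be checked numerically. The following is a minimal sketch, not the reference implementation: it takes $c(x)$ and $d(x)$ from Section 4, and it assumes a stand-in reduction polynomial for GF($2^8$) (`POLY`, which this text does not specify), a placeholder S-box (a seeded random permutation, not the Square S-box) and arbitrary round keys instead of the key schedule. All function names are illustrative.

```python
import random

POLY = 0x11B  # assumption: stand-in reduction polynomial for GF(2^8)
C = [0x02, 0x01, 0x01, 0x03]  # coefficients c_0..c_3 of c(x), Section 4
D = [0x0E, 0x09, 0x0D, 0x0B]  # coefficients d_0..d_3 of d(x), Section 4

# Placeholder S-box (assumption): any byte permutation works for this check.
SBOX = list(range(256))
random.Random(0).shuffle(SBOX)
SBOX_INV = [0] * 256
for x, y in enumerate(SBOX):
    SBOX_INV[y] = x


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r


def row_mul(p, row):
    """Coefficients of p(x) * row(x) mod (1 + x^4)."""
    out = [0] * 4
    for j in range(4):
        for k in range(4):
            out[j] ^= gf_mul(p[(j - k) % 4], row[k])
    return out


def theta(a):
    return [row_mul(C, row) for row in a]


def theta_inv(a):
    return [row_mul(D, row) for row in a]


def gamma(a, box):
    return [[box[x] for x in row] for row in a]


def pi(a):
    return [list(col) for col in zip(*a)]


def sigma(k, a):
    return [[x ^ y for x, y in zip(ra, rk)] for ra, rk in zip(a, k)]


def rho(k, a):
    """Equation (1): sigma[k] o pi o gamma o theta."""
    return sigma(k, pi(gamma(theta(a), SBOX)))


def rho_prime(k, a):
    """Equation (4): sigma[k] o pi o gamma^-1 o theta^-1."""
    return sigma(k, pi(gamma(theta_inv(a), SBOX_INV)))


def square(keys, a):
    """Equation (2), with keys = [k^0, ..., k^8]."""
    a = sigma(keys[0], theta_inv(a))
    for t in range(1, 9):
        a = rho(keys[t], a)
    return a


def square_inverse(keys, a):
    """Same structure as square(), using rho' and kappa^t = theta(k^(8-t))."""
    kappa = [theta(keys[8 - t]) for t in range(9)]
    a = sigma(kappa[0], theta(a))
    for t in range(1, 9):
        a = rho_prime(kappa[t], a)
    return a


def random_state(rng):
    return [[rng.randrange(256) for _ in range(4)] for _ in range(4)]


rng = random.Random(1)
keys = [random_state(rng) for _ in range(9)]  # arbitrary round keys
plaintext = random_state(rng)
ciphertext = square(keys, plaintext)
print(square_inverse(keys, ciphertext) == plaintext)
```

Note that `square_inverse` never calls the inverse of a whole round: it only swaps the S-box and the polynomial for their inverses and transforms the round keys with `theta`.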

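2.8 First round

The $\theta^{-1}$ before $\sigma[k^0]$ in Square can be incorporated in the first round. We have

$$\rho[k^1] \circ \sigma[k^0] \circ \theta^{-1} = \sigma[k^1] \circ \pi \circ \gamma \circ \theta \circ \sigma[k^0] \circ \theta^{-1} = \sigma[k^1] \circ \pi \circ \gamma \circ \sigma[\theta(k^0)].$$

Hence the initial $\theta^{-1}$ can be discarded by omitting $\theta$ in the first round and applying $\theta(k^0)$ instead of $k^0$. The same simplification can be applied to the inverse cipher.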
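The first-round simplification relies only on the linearity of $\theta$. A small sketch comparing both forms of the first round is given below, under the same assumptions as any toy model of the cipher: `POLY` is a stand-in reduction polynomial (not specified in this text), the S-box is a placeholder permutation, and the helper names are illustrative.

```python
import random

POLY = 0x11B  # assumption: stand-in reduction polynomial for GF(2^8)
C = [0x02, 0x01, 0x01, 0x03]  # c(x), Section 4
D = [0x0E, 0x09, 0x0D, 0x0B]  # d(x), Section 4
SBOX = list(range(256))
random.Random(0).shuffle(SBOX)  # placeholder S-box (assumption)


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r


def row_mul(p, row):
    """Coefficients of p(x) * row(x) mod (1 + x^4)."""
    out = [0] * 4
    for j in range(4):
        for k in range(4):
            out[j] ^= gf_mul(p[(j - k) % 4], row[k])
    return out


def theta(a):
    return [row_mul(C, row) for row in a]


def theta_inv(a):
    return [row_mul(D, row) for row in a]


def gamma(a):
    return [[SBOX[x] for x in row] for row in a]


def pi(a):
    return [list(col) for col in zip(*a)]


def sigma(k, a):
    return [[x ^ y for x, y in zip(ra, rk)] for ra, rk in zip(a, k)]


def first_round_full(k0, k1, a):
    """rho[k^1] o sigma[k^0] o theta^-1, as written in equation (2)."""
    return sigma(k1, pi(gamma(theta(sigma(k0, theta_inv(a))))))


def first_round_short(k0, k1, a):
    """sigma[k^1] o pi o gamma o sigma[theta(k^0)]: theta is never applied to the state."""
    return sigma(k1, pi(gamma(sigma(theta(k0), a))))


rng = random.Random(7)
k0, k1, a = ([[rng.randrange(256) for _ in range(4)] for _ in range(4)] for _ in range(3))
print(first_round_full(k0, k1, a) == first_round_short(k0, k1, a))
```

In the short form, `theta(k0)` depends only on the key and can be precomputed once.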


3 Linear and Differential Cryptanalysis

The resistance against linear cryptanalysis [12] and differential cryptanalysis [1] has been the rationale behind the criteria by which the $S_\gamma$ substitution and the multiplication polynomial $c(x)$ have been chosen.

A difference propagation along the rounds of an iterated block cipher is generally called a differential characteristic. A characteristic is specified by a series of difference patterns. The probability associated with a characteristic is the probability that all intermediate difference patterns have the value specified in the above series. We call a differential characteristic a differential trail. The probability associated with a differential trail can be approximated by the product of the difference propagation probabilities between every pair of subsequent rounds (which can be easily calculated). The probability that a given difference pattern $a'$ at the input of a number of cipher rounds gives rise to a difference pattern $b'$ at the output is equal to the sum of the probabilities of all differential trails starting with $a'$ and ending with $b'$. In general the propagation from the input difference pattern $a'$ to the output difference pattern $b'$ is called a differential.

As can be seen in [12] the correlation between a linear combination of input bits and a linear combination of output bits of an iterated block cipher can be treated in an analogous but slightly different way. A linear trail is specified by a series of selection patterns. For a given cipher key, the correlation coefficient (positive or negative) corresponding to a linear trail consists of the product of the correlation coefficients between the linear combinations of bits of every pair of subsequent rounds. In [2] it was shown that the correlation between a linear combination of input bits, denoted by selection pattern $u$, and a linear combination of output bits, denoted by $v$, is equal to the sum of the correlation coefficients of all linear trails starting with $u$ and ending with $v$. It must be remarked that the correlation coefficients may be positive or negative and that the sign depends on the value of round key bits.

$S_\gamma$ and $c(x)$ are chosen to minimize the maximum probability of differential trails and the maximum correlation of linear trails over four rounds. This is obtained in the framework of a very specific approach.

3.1 Wide Trail Design Strategy

In [2] the 'wide trail design strategy' was introduced as a means to guarantee low maximum probability of multiple-round differential trails and low maximum correlation of multiple-round linear trails. In this strategy the round transformation is composed of a number of uniform transformations, that are split in the nonlinear blockwise substitution (corresponding to our $\gamma$) and the composition of the linear transformations (corresponding to our $\theta$ and $\pi$). The round key addition does not play a role in the strategy. It was shown in [2] that the probability of a differential trail is the product of the input-output difference propagation probabilities of the S-boxes with nonzero input difference ('active S-boxes'). The correlation of a linear trail is the product of the input-output correlations of the S-boxes with nonzero output selection patterns ('active S-boxes'). The two mechanisms for eliminating high-probability differential trails and high-correlation linear trails are the following:

- Choose an S-box where the maximum difference propagation probability and the maximum input-output correlation are as small as possible.

- Choose the linear part in such a way that there are no trails with few active S-boxes.

The first mechanism gives us two clear criteria for the selection of the $S_\gamma$. The second mechanism gives a hint on how to select the multiplication polynomial $c(x)$. In the following section we will focus on the linear part.


4 The Multiplication Polynomial c(x)

$\theta$ treats the different rows of a state $a$ completely separately and in the same way. We will now study the difference propagation and correlation properties of $\theta$, concentrating on a single row. Assume an input difference specified by $a'(x) = a(x) \oplus a^*(x)$. The output difference will be given by

$$b'(x) = c(x)\, a(x) \oplus c(x)\, a^*(x) \bmod (1 \oplus x^4) = c(x)\, a'(x) \bmod (1 \oplus x^4).$$

On the other hand, a linear combination of output bits, specified by the selection polynomial $u(x)$, is equal to (i.e., correlated to, with correlation coefficient 1) a linear combination of input bits, specified by the following selection polynomial [2]:

$$v(x) = c(x^{-1})\, u(x) \bmod (1 \oplus x^4).$$

It is intuitively clear that both linear and differential trails would benefit from a multiplication polynomial that could limit the number of nonzero terms in input and output difference (and selection) polynomials. This is exactly what we want to avoid by choosing a polynomial with a high diffusion power, expressed by the so-called branch number.

Let $w_h(a)$ denote the Hamming weight of a vector, i.e., the number of nonzero components in that vector. Applied to a state $a$, a difference pattern $a'$ or a selection pattern $u$, this corresponds to the number of non-zero bytes. In [2] the branch number $B$ of an invertible linear mapping was introduced as

$$B(\theta) = \min_{a \neq 0} \left( w_h(a) + w_h(\theta(a)) \right).$$

This implies that the sum of the Hamming weights of a pair of input and output difference patterns (or selection patterns) to $\theta$ is at least $B$. It can easily be shown that $B$ is a lower bound for the number of active S-boxes in two consecutive rounds of a linear or differential trail. Since $\theta$ operates on each row separately, we can have $B = 5$ at most.

In [15] it was shown how a linear mapping over GF($2^m$)$^n$ with optimal $B$ ($B = n + 1$) can be constructed from a maximum distance separable (MDS) code. The MDS-code used is a Reed-Solomon code over GF($2^m$): if $G_e = [I_{n \times n}\; B_{n \times n}]$ is the echelon form of the generation matrix of a $(2n, n, n+1)$-RS-code, then

$$\theta: X \mapsto Y = B \cdot X$$

defines a linear mapping with optimal branch number. The polynomial multiplication with $c(x)$ corresponds to a special subset of the MDS-codes, having the additional property that $B$ is a circulant matrix. A circulant matrix is a matrix where every row consists of the same elements, shifted over one position, or $b_{i,j} = b_{0,(j-i) \bmod n}$. This property is exploited in Section 9.2 to produce a memory-efficient implementation of the cipher. In [11] we find the following theorem:

Theorem 1. An $(n, k, d)$-code $C$ with generator matrix $G = [I\; B]$ is MDS iff every square submatrix of $B$ is nonsingular.
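Theorem 1 translates directly into a test that can be run on a candidate circulant matrix. The sketch below is illustrative only: the reduction polynomial `POLY` is an assumed stand-in (this text does not state which polynomial defines GF($2^8$) in Square), and the function names are hypothetical. It enumerates every square submatrix and computes its determinant by Gaussian elimination over the field.

```python
from itertools import combinations

POLY = 0x11B  # assumption: stand-in reduction polynomial for GF(2^8)


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r


def gf_inv(a):
    """Inverse of a nonzero byte, computed as a^254."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r


def gf_det(m):
    """Determinant over GF(2^8); in characteristic 2 row swaps do not change it."""
    m = [row[:] for row in m]
    n = len(m)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col]), None)
        if pivot is None:
            return 0
        m[col], m[pivot] = m[pivot], m[col]
        det = gf_mul(det, m[col][col])
        inv = gf_inv(m[col][col])
        for r in range(col + 1, n):
            f = gf_mul(m[r][col], inv)
            for c in range(col, n):
                m[r][c] ^= gf_mul(f, m[col][c])
    return det


def circulant(c):
    """Matrix with entries b[j][k] = c[(j - k) mod n]."""
    n = len(c)
    return [[c[(j - k) % n] for k in range(n)] for j in range(n)]


def is_mds(b):
    """True iff every square submatrix of b is nonsingular (Theorem 1)."""
    n = len(b)
    for size in range(1, n + 1):
        for rows in combinations(range(n), size):
            for cols in combinations(range(n), size):
                sub = [[b[r][c] for c in cols] for r in rows]
                if gf_det(sub) == 0:
                    return False
    return True


print(is_mds(circulant([0x02, 0x01, 0x01, 0x03])))
print(is_mds(circulant([0x01, 0x01, 0x01, 0x01])))
```

The second call uses an all-ones circulant, which fails already on a $2 \times 2$ submatrix. The check treats all submatrices as distinct; it does not exploit the reduced number of non-equivalent determinants of a circulant.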

In a matrix with elements from GF($2^m$) every determinant has a probability of $2^{-m}$ to evaluate to zero. For increasing size of the matrix the number of determinants increases exponentially, making it infeasible to search randomly for an MDS-code. However, in a circulant matrix the number of distinct determinants is only a fraction of the number for arbitrary matrices (cf. Table 1). By imposing the extra constraint that the matrix should be a circulant, we increase the probability to find an MDS-code by random search.

  n   generic   circulant      n   generic   circulant
  1         1           1      5       252          41
  2         5           3      6       924         111
  3        20           7      7      3431         309
  4        70          17      8     12869         935

Table 1. The number of square submatrices in a generic matrix of order n, and the number of non-equivalent determinants in a circulant matrix of the same order. The numbers of the last column were obtained by an exhaustive computer search.


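$c(x)$ corresponds to a $4 \times 4$ matrix, hence if we choose it randomly, the probability that it has $B = 5$ can be approximated by $(1 - \frac{1}{256})^{17} \approx 0.93$, where 17 is the number of non-equivalent determinants of a circulant matrix of order 4 (Table 1). This gives us a high degree of freedom in the choice of $c(x)$. We choose

$$c(x) = 2_x \oplus 1_x \cdot x \oplus 1_x \cdot x^2 \oplus 3_x \cdot x^3,$$

where the subscript $x$ denotes hexadecimal notation. This determines $d(x)$ uniquely:

$$d(x) = E_x \oplus 9_x \cdot x \oplus D_x \cdot x^2 \oplus B_x \cdot x^3.$$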
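The relation between $c(x)$ and $d(x)$, and the probability estimate, can be reproduced with a few lines of code. This is a sketch under one stated assumption: the reduction polynomial of GF($2^8$) is not given in this text, so `poly` defaults to a stand-in value. For the product $c(x)\,d(x)$ this choice does not matter, since every coefficient product involved has degree below 8 and is never reduced.

```python
C = [0x02, 0x01, 0x01, 0x03]  # c_0..c_3
D = [0x0E, 0x09, 0x0D, 0x0B]  # d_0..d_3


def gf_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8); `poly` is an assumed stand-in modulus."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r


def poly_mul_mod(p, q, poly=0x11B):
    """Coefficients of p(x) * q(x) mod (1 + x^4), lowest degree first."""
    out = [0, 0, 0, 0]
    for j in range(4):
        for k in range(4):
            out[j] ^= gf_mul(p[(j - k) % 4], q[k], poly)
    return out


# d(x) c(x) = 1 mod (1 + x^4): the result is the constant polynomial 1.
product = poly_mul_mod(C, D)
print([hex(v) for v in product])

# A row with a single nonzero byte is mapped to a row with four nonzero bytes,
# so input weight plus output weight equals 5 for this input.
image = poly_mul_mod(C, [0x01, 0x00, 0x00, 0x00])
print([hex(v) for v in image])

# Probability estimate that a random circulant 4x4 matrix has B = 5.
estimate = (1 - 1 / 256) ** 17
print(round(estimate, 4))
```

Applying `poly_mul_mod(D, ...)` to the output of `poly_mul_mod(C, ...)` returns the original row, which is how $\theta^{-1}$ would be computed on one row.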

4.1 Motivation for the Choice of $\pi$

Since the branch number of $c(x)$ is 5, the number of active S-boxes in a two-round trail is at least 5. The interchanging of rows and columns by $\pi$ has the effect that any trail over four consecutive rounds will have at least 25 active S-boxes. A simple and clear proof of this is available and will be published in a more theoretical paper that is being written [3].

5 The Nonlinear Substitution

As explained above, the relevant criteria imposed upon the S-box are the highest (in absolute value) occurring correlation between any pair of linear combinations of input bits and linear combinations of output bits (denoted by $\lambda$) and the highest occurring probability corresponding to any pair of input difference and output difference patterns. The latter corresponds to the highest value in the so-called exor table of the S-box, defined as

$$E_{ij} = \#\{x \mid S(x) \oplus S(x \oplus i) = j\}.$$

We define $\delta = \max_{i,j} \{E_{ij}\} \cdot 2^{-8}$, where the maximum is taken over nonzero input differences $i$.
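Both criteria can be evaluated exhaustively for any 8-bit S-box. The sketch below is illustrative: it computes $\delta$ from the exor table and $\lambda$ with a Walsh-Hadamard transform, and applies them to the multiplicative inverse in GF($2^8$). The reduction polynomial is an assumed stand-in (it is not specified in this text), and the function names are hypothetical.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8); `poly` is an assumed stand-in modulus."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r


def inverse_sbox():
    """x -> x^-1 in GF(2^8), with 0 mapped to 0 (x^-1 computed as x^254)."""
    table = [0] * 256
    for x in range(1, 256):
        y = 1
        for _ in range(254):
            y = gf_mul(y, x)
        table[x] = y
    return table


def delta(sbox):
    """Largest exor-table entry E_ij over nonzero i, scaled by 2^-8."""
    best = 0
    for i in range(1, 256):
        counts = [0] * 256
        for x in range(256):
            counts[sbox[x] ^ sbox[x ^ i]] += 1
        best = max(best, max(counts))
    return best / 256


def lam(sbox):
    """Largest absolute input-output correlation over nonzero output masks v."""
    best = 0
    for v in range(1, 256):
        # f[x] = (-1)^(v . S(x)); after the transform, f[u] is the sum over x
        # of (-1)^(u . x + v . S(x)), i.e. 256 times the correlation.
        f = [1 - 2 * (bin(v & sbox[x]).count("1") & 1) for x in range(256)]
        h = 1
        while h < 256:
            for start in range(0, 256, 2 * h):
                for k in range(start, start + h):
                    a, b = f[k], f[k + h]
                    f[k], f[k + h] = a + b, a - b
            h *= 2
        best = max(best, max(abs(w) for w in f))
    return best / 256


sbox = inverse_sbox()
print(delta(sbox), lam(sbox))
```

The same two functions can be applied to randomly generated permutations to explore how $\delta$ and $\lambda$ are distributed.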

We present three alternative choices for the S-box: explicitly constructed nonlinear algebraic transformations, slightly modified versions of the latter and randomly selected invertible mappings.

5.1 Explicit Construction

In [13] a method is given to construct $m$-bit S-boxes with $\lambda = 2^{1 - m/2}$ and $\delta = 2^{2 - m}$, the theoretically minimum possible values. From the proposals in [13] we select the mapping $x \mapsto x^{-1}$ over GF($2^8$), with $\delta = 2^{-6}$ and $\lambda = 2^{-3}$.

The problem with this choice is that the mapping has a very simple description in GF($2^8$). The other components of the round transformation also have a simple description in GF($2^8$). This may enable cryptanalytic attacks based on the algebraic manipulation of equations to derive key information [4].

Any $m$-bit mapping can be represented as a polynomial or a rational form in GF($2^m$). It is however unlikely that this representation can be exploited in cryptanalysis if the polynomial or rational form is of no special, relatively simple, form.

The feasibility of algebraic manipulation can be severely diminished. The elements of GF($2^8$) can be represented with respect to different bases. By choosing a different basis for the definition of $\gamma$ and $\theta$ we can prevent that the round transformation has a simple description in any basis of GF($2^8$).

Still, even specified in another basis, the chosen nonlinear mapping stays an involution and has two fixed points: 0 and 1. By applying an affine transformation on the individual bits of the output these properties can be removed and a simple algebraic expression of the round transformation in any basis of GF($2^8$) can be prevented.


5.2 Modifications

Another method to prevent a simple algebraic description is by choosing a mapping according to the method explained in the previous subsection and subsequently modifying it slightly to destroy the exploitable algebraic structure. It will be seen that the disadvantage of this approach is that $\lambda$ and $\delta$ will increase.

We conducted some experiments starting from the mapping multiplicative inverse in GF($2^8$) as proposed above ($\delta = 2^{-6}$ and $\lambda = 8 \cdot 2^{-6}$) and we applied a small number of modifications.

When we consider the mapping as a look-up table and investigate all variants that have a pair of entries swapped, an increase is observed. We also tested 300000 variants in which four or eight entries were swapped. Swapping four entries increases $\lambda$ to $9 \cdot 2^{-6}$, swapping eight entries increases $\lambda$ to $10 \cdot 2^{-6}$ and $\delta$ to $6 \cdot 2^{-8}$.

5.3 Random Search

Since algebraically constructed permutations always exhibit some structure that may be exploited in attacks in unanticipated ways, designers often resort to random substitutions: a substitution is selected from a set of substitutions that are generated by the use of a random source and evaluated with respect to (presumably) relevant nonlinearity criteria. In [14] the average differential properties of permutations are investigated and a bound for the expected value of $\delta$ is given. For an $m$-bit permutation

$$\lim_{m \to \infty} \frac{E[2^m \delta]}{2m} \le 1.$$

We verified this experimentally for 1.5 million samples with $m = 8$ and measured at the same time $\lambda$. The results are given in Table 2. The S-boxes with the highest resistance against both linear and differential cryptanalysis have $\delta = 10 \cdot 2^{-8}$ and $\lambda = 15 \cdot 2^{-6}$.

  λ \ δ      8·2^-8   10·2^-8   12·2^-8   14·2^-8   16·2^-8   18·2^-8   20·2^-8
  15·2^-6    0         0.07      0.07     0.006     0.0001    0         0
  16·2^-6    0.0003    4.77      5.58     0.58      0.04      0.002     0
  17·2^-6    0.002    15.63     20.55     2.24      0.15      0.007     0.0004
  18·2^-6    0.0002   12.21     17.17     1.96      0.13      0.007     0.0005
  19·2^-6    0.0004    4.91      7.31     0.87      0.05      0.003     0
  20·2^-6    0         1.52      2.34     0.28      0.02      0.001     0
  21·2^-6    0         0.41      0.64     0.08      0.004     0.001     0

Table 2. Maximum input-output correlation and difference propagation probability of randomly generated nonlinear permutations. The entries denote the percentage of the generated mappings that have the indicated $\lambda$ and $\delta$.


5.4 Our Choice

Because of its optimal values for $\lambda$ and $\delta$, we have decided to take for $S_\gamma$ an S-box that is constructed by taking the mapping $x \mapsto x^{-1}$ and applying an affine transformation (over GF(2)) to the output bits. This affine transformation has the property that it has a complicated description in GF($2^8$) to thwart interpolation attacks [4].

Our choices force all four-round differential trails to have an associated probability not higher than $2^{-150}$, far below the critical noise value of $2^{-127}$. Equivalently, four-round linear trails have an associated correlation not over $2^{-75}$, far below the critical noise value of $2^{-64}$. Hence, for resistance against conventional LC and DC six rounds may seem sufficient. However, the specific blocked structure of the cipher allows for more efficient dedicated differential attacks. This will be explained in the following section.

6 A Dedicated Attack

In this section we describe a dedicated attack that exploits the cipher structure of Square. The attack is a chosen plaintext attack and is independent of the specific choices of $S_\gamma$, $c(x)$ and the key schedule. It is faster than an exhaustive key search for Square versions of up to 6 rounds. After describing the basic attack on 4 rounds, we will show how it can be extended to 5 and 6 rounds.

6.1 Preliminaries

Let a $\Lambda$-set be a set of 256 states that are all different in some of the (16) state bytes (the active) and all equal in the other state bytes (the passive). Let $\lambda$ be the set of indices of the active bytes. We have

$$\forall x, y \in \Lambda: \begin{cases} x_{i,j} \neq y_{i,j} & \text{for } (i,j) \in \lambda \\ x_{i,j} = y_{i,j} & \text{for } (i,j) \notin \lambda \end{cases}$$

In this section we will make use of the geometrical interpretation as presented in Figure 1. Applying the transformations $\gamma$ and $\sigma[k^t]$ on (the elements of) a $\Lambda$-set results in a (generally different) $\Lambda$-set with the same $\lambda$. Applying $\pi$ results in a $\Lambda$-set in which the active bytes are transposed by $\pi$. Applying $\theta$ to a $\Lambda$-set does not necessarily result in a $\Lambda$-set. However, since every output byte of $\theta$ is a linear combination (with invertible coefficients) of the four input bytes in the same row, an input row with a single active byte gives rise to an output row with only active bytes.

6.2 Four Rounds

Consider a $\Lambda$-set in which only one byte is active. We will now trace the evolution of the positions of the active bytes through 3 rounds. The 1st round contains no $\theta$, hence there is still only one byte active at the beginning of the 2nd round. $\theta$ of the 2nd round converts this to a complete row of active bytes, that is subsequently transformed by $\pi$ to a complete column. $\theta$ of the 3rd round converts this to a $\Lambda$-set with only active bytes. This is still the case at the input to the 4th round.

Since the bytes of the outputs of the 3rd round (denoted by $a$) range over all possible values and are therefore balanced over the $\Lambda$-set, we have

$$\bigoplus_{b = \theta(a),\, a \in \Lambda} b_{i,j} = \bigoplus_{a \in \Lambda} \bigoplus_k c_{j-k}\, a_{i,k} = \bigoplus_l c_l \bigoplus_{a \in \Lambda} a_{i,j-l} = \bigoplus_l c_l \cdot 0 = 0,$$

with all indices taken modulo 4. Hence, the bytes of the output of $\theta$ of the fourth round are balanced. This balancedness is in general destroyed by the subsequent application of $\gamma$.

An output byte of the 4th round (denoted by $a$ here) can be expressed as a function of the intermediate state $b$ above:

$$a_{i,j} = S_\gamma[b_{j,i}] \oplus k^4_{i,j}.$$

By assuming a value for $k^4_{i,j}$, the value of $b_{j,i}$ for all elements of the $\Lambda$-set can be calculated from the ciphertexts. If the values of this byte are not balanced over $\Lambda$, the assumed value for the key byte was wrong. This is expected to eliminate all but approximately 1 key value. This can be repeated for the other bytes of $k^4$.

We implemented the attack and found that two $\Lambda$-sets of 256 chosen plaintexts each are sufficient to uniquely determine the cipher key with an overwhelming probability of success.
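The balancedness test of the 4-round attack can be sketched as follows. This is a toy model under stated assumptions rather than the authors' implementation: the reduction polynomial `POLY` and the S-box (a seeded random permutation) are stand-ins, which is harmless here because the attack does not depend on these choices; the round keys are drawn at random instead of coming from the key schedule; and all names are illustrative. The sketch only recovers candidates for the last round key.

```python
import random

POLY = 0x11B  # assumption: stand-in reduction polynomial for GF(2^8)
C = [0x02, 0x01, 0x01, 0x03]  # c(x), Section 4
rng = random.Random(2024)
SBOX = list(range(256))
rng.shuffle(SBOX)  # placeholder S-box (assumption)
SBOX_INV = [0] * 256
for x, y in enumerate(SBOX):
    SBOX_INV[y] = x


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r


def theta(a):
    out = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            for k in range(4):
                out[i][j] ^= gf_mul(C[(j - k) % 4], a[i][k])
    return out


def gamma(a):
    return [[SBOX[x] for x in row] for row in a]


def pi(a):
    return [list(col) for col in zip(*a)]


def sigma(k, a):
    return [[x ^ y for x, y in zip(ra, rk)] for ra, rk in zip(a, k)]


def random_state():
    return [[rng.randrange(256) for _ in range(4)] for _ in range(4)]


def four_rounds(keys, a):
    """Key addition, a first round without theta, then three full rounds."""
    a = sigma(keys[0], a)
    a = sigma(keys[1], pi(gamma(a)))
    for t in (2, 3, 4):
        a = sigma(keys[t], pi(gamma(theta(a))))
    return a


def lambda_set():
    """256 plaintexts with byte (0,0) active and all other bytes passive."""
    base = random_state()
    texts = []
    for v in range(256):
        p = [row[:] for row in base]
        p[0][0] = v
        texts.append(p)
    return texts


def candidates(ciphertexts, i, j):
    """Guesses g for k^4_{i,j} for which b_{j,i} = S^-1[a_{i,j} ^ g] is balanced."""
    good = set()
    for g in range(256):
        acc = 0
        for ct in ciphertexts:
            acc ^= SBOX_INV[ct[i][j] ^ g]
        if acc == 0:
            good.add(g)
    return good


keys = [random_state() for _ in range(5)]  # k^0 .. k^4, unknown to the attacker
sets = [[four_rounds(keys, p) for p in lambda_set()] for _ in range(2)]
survivors = [[candidates(sets[0], i, j) & candidates(sets[1], i, j)
              for j in range(4)] for i in range(4)]
print(all(keys[4][i][j] in survivors[i][j] for i in range(4) for j in range(4)))
```

The correct key byte always passes the test; intersecting the candidates of two $\Lambda$-sets is what typically removes the remaining wrong guesses.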

6.3 Extension by a Round at the End

If an additional round is added, we have to calculate the above value of $b_{j,i}$ from the output of the 5th round instead of the 4th round. This can be done by additionally assuming a value for a set of 4 bytes of the 5th round key. As in the case of the 4-round attack, wrong key assumptions are eliminated by verifying that $b_{j,i}$ is not balanced.

In this 5-round attack $2^{40}$ key values must be checked, and this must be repeated 4 times. Since checking a single $\Lambda$-set leaves only 1/256 of the wrong key assumptions as possible candidates, the cipher key can be found with overwhelming probability with only 5 $\Lambda$-sets.


6.4 Extension by a Round at the Beginning

The basic idea is to choose a set of plaintexts that results in a $\Lambda$-set at the output of the 2nd round with a single active S-box. This requires the assumption of values of four bytes of the round key $k^0$.

If the intermediate state after $\theta$ of the 2nd round has only a single active byte, this is also the case for the output of the 2nd round. This imposes the following conditions on a row of four input bytes of $\theta$ of the second round: one particular linear combination of these bytes must range over all 256 possible values (active) while 3 other particular linear combinations must be constant for all 256 states. This imposes identical conditions on the bytes in the same row in the input to $\sigma[k^1]$, and consequently on a column of bytes in the input to $\pi$ of the 1st round. If the corresponding column of bytes of $k^0$ is known, these conditions can be converted to conditions on four plaintext bytes.

Now we consider a set of $2^{32}$ plaintexts, such that the array of bytes in one column ranges over all possible values and all other bytes are constant.

Now, make an assumption for the value of the 4 bytes of the relevant column of $k^0$. Select from the set of $2^{32}$ available plaintexts a set of 256 plaintexts that obey the conditions indicated above. Now the 4-round attack can be performed. For the given key assumption, the attack can be repeated for several plaintext sets. If the byte values of the last round key suggested by these attacks are not consistent, the initial assumption must have been wrong. A correct assumption for the bytes of $k^0$ will result in the swift and consistent recuperation of the last round key.

We implemented this attack where we assumed knowledge of 16 bits of the first-round key. The attack found the other 16 bits of the first-round key and 128 bits of the last-round key using only 2 structures of 256 plaintexts for every key value guessed in the first round.

6.5 Complexity of the Attacks

Combining both extensions results in a 6 round attack. Although infeasible with current technology, this attack is faster than exhaustive key search, and therefore relevant. We have not found extensions to 7 rounds faster than exhaustive key search.

We summarize the attacks in Table 3.

  Attack            #Plaintexts   Time    Memory
  4-round           2^9           2^9     small
  5-round type 1    2^11          2^40    small
  5-round type 2    2^32          2^40    2^32
  6-round           2^32          2^72    2^32

Table 3. Complexities of the attack on Square.

7 Number of Rounds

Due to these attacks we have to increase the number of rounds to at least seven. As a safety margin, we fixed the number of rounds to eight.

Conservative users are free to increase the number of rounds. This can be done in a straightforward way and requires no adaptation of the key schedule whatsoever.


8 The Key Evolution

The key schedule specifies the derivation of the round keys in terms of the cipher key. Its function is to provide resistance against the following types of attack:

- Attacks in which part of the cipher key is known to the cryptanalyst, e.g., if the cipher is used with a key shorter than 128 bits.

- Attacks where the key entry to the cipher is known or can be chosen, e.g., if the cipher is used as the compression function of a hash algorithm [7].

- Related-key attacks.

Resistance against the first type of attack can be improved by a key schedule in which the round key undergoes a transformation with high diffusion. For a good scheme, the knowledge of a certain number of bits of one round key fixes very few bits in other round keys. The other two types of attack exploit regularities in the structure of the key schedule by locally compensating round key differences [5,7].

The key schedule also plays an important role in the elimination of symmetry:

- Symmetry in the round transformation: the round transformation treats all bytes of a state in very much the same way. This symmetry can be removed by having round constants in the key schedule.

- Symmetry between the rounds: the round transformation is the same for all rounds. This equality can be removed by having round-dependent round constants in the key schedule.

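The key schedule is defined in terms of the rows of the key. We can define a left byte-rotation operation rotl($a_i$) on a row as

$$\text{rotl}[a_{i,0}\; a_{i,1}\; a_{i,2}\; a_{i,3}] = [a_{i,1}\; a_{i,2}\; a_{i,3}\; a_{i,0}]$$

and a right byte rotation rotr($a_i$) as its inverse. The key schedule iteration transformation $k^{t+1} = \psi(k^t)$ and its inverse are defined by

$$\begin{aligned}
k^{t+1}_0 &= k^t_0 \oplus \text{rotl}(k^t_3) \oplus C_t \\
k^{t+1}_1 &= k^t_1 \oplus k^{t+1}_0 \\
k^{t+1}_2 &= k^t_2 \oplus k^{t+1}_1 \\
k^{t+1}_3 &= k^t_3 \oplus k^{t+1}_2
\end{aligned}
\qquad
\begin{aligned}
\kappa^{t+1}_3 &= \kappa^t_3 \oplus \kappa^t_2 \\
\kappa^{t+1}_2 &= \kappa^t_2 \oplus \kappa^t_1 \\
\kappa^{t+1}_1 &= \kappa^t_1 \oplus \kappa^t_0 \\
\kappa^{t+1}_0 &= \kappa^t_0 \oplus \text{rotr}(\kappa^t_3) \oplus C'_t
\end{aligned}$$

The simplicity of the inverse key schedule is thanks to the fact that $\theta$ and $\psi$ commute. The round constants $C_t$ are also defined iteratively. We have $C_0 = 1_x$ and $C_t = 2_x \cdot C_{t-1}$.

This choice provides high diffusion and removes the regularities in an efficient way.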
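The forward key schedule is short enough to write out. The sketch below follows the four row equations for $k^{t+1}$ and is not the reference implementation: how the byte $C_t$ is placed inside a four-byte row is not stated in this text, so the code assumes (hypothetically) that it is added to the first byte of the row. `psi_inv` is simply the direct inverse of `psi` on $k$, obtained by undoing the steps in reverse order; it is not the $\kappa$ recursion given above. For $t \le 7$ the constants $C_t = 2^t$ need no modular reduction, so no field polynomial is required here.

```python
def rotl(row):
    """[a0 a1 a2 a3] -> [a1 a2 a3 a0]."""
    return row[1:] + row[:1]


def rotr(row):
    """Inverse of rotl: [a0 a1 a2 a3] -> [a3 a0 a1 a2]."""
    return row[-1:] + row[:-1]


def xor_rows(*rows):
    out = [0, 0, 0, 0]
    for row in rows:
        out = [x ^ y for x, y in zip(out, row)]
    return out


def constant_row(t):
    """Row holding C_t = 2^t; placing it in the first byte is an assumption."""
    if not 0 <= t <= 7:
        raise ValueError("this sketch covers t = 0..7 only")
    return [1 << t, 0, 0, 0]


def psi(k, t):
    """Compute k^(t+1) from k^t; k is a list of four rows of four bytes."""
    r0 = xor_rows(k[0], rotl(k[3]), constant_row(t))
    r1 = xor_rows(k[1], r0)
    r2 = xor_rows(k[2], r1)
    r3 = xor_rows(k[3], r2)
    return [r0, r1, r2, r3]


def psi_inv(k_next, t):
    """Recover k^t from k^(t+1) by undoing the four steps in reverse order."""
    r3 = xor_rows(k_next[3], k_next[2])
    r2 = xor_rows(k_next[2], k_next[1])
    r1 = xor_rows(k_next[1], k_next[0])
    r0 = xor_rows(k_next[0], rotl(r3), constant_row(t))
    return [r0, r1, r2, r3]


def round_keys(cipher_key):
    """Return [k^0, ..., k^8] with k^0 equal to the cipher key."""
    keys = [cipher_key]
    for t in range(8):
        keys.append(psi(keys[-1], t))
    return keys


cipher_key = [[4 * i + j for j in range(4)] for i in range(4)]
keys = round_keys(cipher_key)
print(len(keys), psi_inv(keys[1], 0) == cipher_key)
```

Because each new row depends on the row just computed, a change in row 0 reaches all four rows within one iteration.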


9 Implementation Aspects

9.1 8-bit Processor

On an 8-bit processor Square can be programmed by simply implementing the different component transformations. This is straightforward for $\pi$ and $\sigma$. The transformation $\gamma$ requires a table of 256 bytes. $\theta$ requires multiplication in the field GF($2^8$). However, the multiplication polynomial has been chosen to make this very efficient. We have written a program implementing Square in Assembler for the Motorola M68HC05 microprocessor, typical for Smart Cards. The machine code occupies in total 547 bytes of ROM, needs 36 bytes of RAM and one execution of Square, including the key schedule, takes about 7500 cycles. This corresponds to less than 2 msec with a 4 MHz clock.

The inverse cipher however is significantly slower than the forward cipher. This is caused by the difference in complexity between $\theta$ and $\theta^{-1}$.



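9.2 32-bit Processor

The succession of steps $\theta \circ \sigma[k^t] \circ \pi \circ \gamma = \sigma[k'^t] \circ \theta \circ \pi \circ \gamma$, with $k'^t = \theta(k^t)$, can be combined in a single set of table lookups. The intermediate state can be represented by four 32-bit words, each containing a row $[a_i]$. Its transpose is denoted by $[a_i]^T$. For $b = \theta(\pi(\gamma(a))) \oplus k'^t$ we have

$$[b_i]^T = \begin{bmatrix} c_0 & c_3 & c_2 & c_1 \\ c_1 & c_0 & c_3 & c_2 \\ c_2 & c_1 & c_0 & c_3 \\ c_3 & c_2 & c_1 & c_0 \end{bmatrix} \begin{bmatrix} S_\gamma[a_{0,i}] \\ S_\gamma[a_{1,i}] \\ S_\gamma[a_{2,i}] \\ S_\gamma[a_{3,i}] \end{bmatrix} \oplus [k'^t_i]^T$$

$$= \begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{bmatrix} S_\gamma[a_{0,i}] \oplus \begin{bmatrix} c_3 \\ c_0 \\ c_1 \\ c_2 \end{bmatrix} S_\gamma[a_{1,i}] \oplus \begin{bmatrix} c_2 \\ c_3 \\ c_0 \\ c_1 \end{bmatrix} S_\gamma[a_{2,i}] \oplus \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ c_0 \end{bmatrix} S_\gamma[a_{3,i}] \oplus [k'^t_i]^T.$$

We define the tables $M$ and $T$ as

$$M[a] = a \cdot [c_0\; c_1\; c_2\; c_3], \qquad T[a] = M[S_\gamma[a]].$$

$T$ and $M$ have 256 entries of four bytes each. The table $M$ implements the polynomial multiplication. $T$ combines the nonlinear substitution with this multiplication. Now we have

$$[b_i] = \bigoplus_j \text{rotr}^j(T[a_{j,i}]) \oplus [k'^t_i].$$

We conclude that $\sigma[\theta(k^t)] \circ \theta \circ \pi \circ \gamma$ can be done with 16 table lookups, 12 rotations and 16 exors of 32-bit words. This implementation needs the table $T$, with 256 entries of four bytes, i.e. one kilobyte in total.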
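The table-based round can be compared against the component-wise computation in a few lines. The sketch below is illustrative and makes several assumptions that are not taken from this text: the reduction polynomial `POLY`, the placeholder S-box (a seeded random permutation), and the packing of a row into a 32-bit word with byte 0 in the most significant position. Function names are hypothetical.

```python
import random

POLY = 0x11B  # assumption: stand-in reduction polynomial for GF(2^8)
C = [0x02, 0x01, 0x01, 0x03]  # c_0..c_3
SBOX = list(range(256))
random.Random(0).shuffle(SBOX)  # placeholder S-box (assumption)


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r


def pack(row):
    """Row [x0 x1 x2 x3] -> 32-bit word with x0 in the top byte (assumed layout)."""
    return (row[0] << 24) | (row[1] << 16) | (row[2] << 8) | row[3]


def unpack(w):
    return [(w >> 24) & 0xFF, (w >> 16) & 0xFF, (w >> 8) & 0xFF, w & 0xFF]


def rotr_bytes(w, j):
    """Rotate a packed row right by j byte positions."""
    s = 8 * (j % 4)
    return ((w >> s) | (w << (32 - s))) & 0xFFFFFFFF


# M[a] = a * [c0 c1 c2 c3] and T[a] = M[S[a]], 256 words each.
M = [pack([gf_mul(a, c) for c in C]) for a in range(256)]
T = [M[SBOX[a]] for a in range(256)]


def round_tables(a, k_prime):
    """[b_i] = XOR_j rotr^j(T[a_{j,i}]) xor [k'_i]: four lookups per output row."""
    out = []
    for i in range(4):
        w = pack(k_prime[i])
        for j in range(4):
            w ^= rotr_bytes(T[a[j][i]], j)
        out.append(unpack(w))
    return out


def round_components(a, k_prime):
    """theta(pi(gamma(a))) xor k', computed step by step."""
    s = [[SBOX[a[k][i]] for k in range(4)] for i in range(4)]  # pi(gamma(a))
    out = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            for k in range(4):
                out[i][j] ^= gf_mul(C[(j - k) % 4], s[i][k])
            out[i][j] ^= k_prime[i][j]
    return out


rng = random.Random(3)
state = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]
key = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]
print(round_tables(state, key) == round_components(state, key))
```

With this layout, byte 2 of `T[x]` equals `SBOX[x]` because $c_2 = 1_x$, so the S-box needed for the last round can be read from $T$ without a separate table.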


Last Round. It can be seen that in this implementation, $\theta$ of the last round is already executed in the previous set of table-lookups. In the last round the function to be applied is $\sigma[k^8] \circ \pi \circ \gamma$. This can be realised by replacing the table $T[x] = M[S_\gamma[x]]$ by $S_\gamma[x]$. Since $c_2 = 1_x$, the unity in GF($2^8$), the entries of the small table $S_\gamma$ can be extracted from $T$, removing the extra storage requirement for $S_\gamma$.

Performance. The reference implementation is written in C and runs at 2.63 MByte/s on a 100 MHz Pentium with the Windows 95 operating system. The inverse cipher can be implemented in exactly the same way as the cipher itself and has the same performance. The difference is in the tables and the precalculation of the round keys.

10 Acknowledgements


Square. Paulo Barreto can be reached at pbarreto@uninet.com.br.


References

1. E. Biham and A. Shamir, "Differential cryptanalysis of DES-like cryptosystems," Journal of Cryptology, Vol. 4, No. 1, 1991, pp. 3–72.

2. J. Daemen, "Cipher and hash function design strategies based on linear and differential cryptanalysis," Doctoral Dissertation, March 1995, K.U.Leuven.

3. J. Daemen and V. Rijmen, "Self-reciprocal cipher structures," COSIC internal report 96-3, 1996.

4. T. Jakobsen and L.R. Knudsen, "The interpolation attack on block ciphers," these proceedings.

5. J. Kelsey, B. Schneier and D. Wagner, "Key-schedule cryptanalysis of IDEA, G-DES, GOST, SAFER, and Triple-DES," Advances in Cryptology, Proceedings Crypto'96, LNCS 1109, N. Koblitz, Ed., Springer-Verlag, 1996, pp. 237–252.

6. L.R. Knudsen, "Truncated and higher order differentials," Fast Software Encryption, LNCS 1008, B. Preneel, Ed., Springer-Verlag, 1995, pp. 196–211.

7. L.R. Knudsen, "A key-schedule weakness in SAFER-K64," Advances in Cryptology, Proceedings Crypto'95, LNCS 963, D. Coppersmith, Ed., Springer-Verlag, 1995, pp. 274–286.

8. L.R. Knudsen and T.A. Berson, "Truncated differentials of SAFER," Fast Software Encryption, LNCS 1039, D. Gollmann, Ed., Springer-Verlag, 1996, pp. 15–26.

9. N. Koblitz, "A Course in Number Theory and Cryptography," Springer-Verlag.

10. X. Lai, J.L. Massey and S. Murphy, "Markov ciphers and differential cryptanalysis," Advances in Cryptology, Proceedings Eurocrypt'91, LNCS 547, D.W. Davies, Ed., Springer-Verlag, 1991, pp. 17–38.

11. F.J. MacWilliams, N.J.A. Sloane, "The Theory of Error-Correcting Codes," North-Holland, Amsterdam, 1977.

12. M. Matsui, "Linear cryptanalysis method for DES cipher," Advances in Cryptology, Proceedings Eurocrypt'93, LNCS 765, T. Helleseth, Ed., Springer-Verlag, 1994, pp. 386–397.

13. K. Nyberg, "Differentially uniform mappings for cryptography," Advances in Cryptology, Proceedings Eurocrypt'93, LNCS 765, T. Helleseth, Ed., Springer-Verlag, 1994, pp. 55–64.

14. L. O'Connor, "On the distribution of characteristics in bijective mappings," Journal of Cryptology, Vol. 8, No. 2, 1995, pp. 67–86.

15. V. Rijmen, J. Daemen et al., "The cipher SHARK," Fast Software Encryption, LNCS 1039, D. Gollmann, Ed., Springer-Verlag, 1996, pp. 99–112.



