A dynamic thompson sampling hyper-heuristic framework for learning activity planning in personalized learning

University of Groningen

Aslan, Ayse; Bakir, Ilke; Vis, Iris F. A.

Published in:

European Journal of Operational Research

DOI:

10.1016/j.ejor.2020.03.038

Document Version

Publisher's PDF, also known as Version of record

Publication date:

2020

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):

Aslan, A., Bakir, I., & Vis, I. F. A. (2020). A dynamic thompson sampling hyper-heuristic framework for

learning activity planning in personalized learning. European Journal of Operational Research, 286(2),

673-688. https://doi.org/10.1016/j.ejor.2020.03.038


Innovative Applications of O.R.

A dynamic thompson sampling hyper-heuristic framework for learning activity planning in personalized learning

Ayse Aslan ∗, Ilke Bakir, Iris F. A. Vis

Department of Operations, University of Groningen, Groningen 9747, AD, the Netherlands

Article info

Article history: Received 8 September 2019; Accepted 11 March 2020; Available online 19 March 2020

Keywords: Timetabling; Hyper-heuristics; Dynamic thompson sampling; Personalized learning; OR in education

Abstract

Personalized learning is emerging in schools as an alternative to one-size-fits-all education. This study introduces and explores a weekly demand-driven flexible learning activity planning problem of own-pace own-method personalized learning. The introduced problem is a computationally intractable optimization problem involving many decision dimensions and also many soft constraints. We propose batch and decomposition methods to generate good-quality initial solutions and a dynamic Thompson sampling based hyper-heuristic framework, as a local search mechanism, which explores the large solution space of this problem in an integrative way. The characteristics of our test instances comply with average secondary schools in the Netherlands and are based on expert opinions and surveys. The experiments, which benchmark the proposed heuristics against Gurobi MIP solver on small instances, illustrate the computational challenge of this problem numerically. According to our experiments, the batch method seems quicker and also can provide better quality solutions for the instances in which resource levels are not scarce, while the decomposition method seems more suitable in resource scarcity situations. The dynamic Thompson sampling based online learning heuristic selection mechanism is shown to provide significant value to the performance of our hyper-heuristic local search. We also provide some practical insights; our experiments numerically demonstrate the alleviating effects of large school sizes on the challenge of satisfying high-spread learning demands.

1. Introduction

Education is shifting from traditional one-size-fits-all models which offer standardized learning paths for everyone in a certain group (e.g., age, level) to personalized learning (Bray & McClaskey, 2013; West-Burnham & Coates, 2005). Schools are implementing various personalized learning models in which students have the freedom to customize their learning paths and learn at their own pace with their own methods throughout the world (see Eiken (2011), Prain et al. (2013) and Kannan, van den Berg, and Kuo (2012), for examples from Europe, Australia, and U.S.A., respectively). In Europe, the Swedish kunskapsskolan personalized learning model, initiated in 2000 in four schools in Sweden, is now being implemented in more than 100 schools around the world (see http://www.kunskapsskolan.com/thekednetwork). According to a report by the European Commission published in 2017 (see https://ec.europa.eu/epsc/publications/other-publications/10-trends-transforming-education-we-know-it_en), personalized learning is an important strategic trend to transform education.

∗ Corresponding author.

E-mail addresses: ayse.aslan@rug.nl (A. Aslan), i.bakir@rug.nl (I. Bakir), i.f.a.vis@rug.nl (I.F.A. Vis).

Contrary to one-size-fits-all education, students in personalized learning are not tied to classes; instead, their learning needs are regarded individually. In personalized learning, technology plays an important role; online learning portals are often available for students so that they can also learn independently through self-study learning activities in schools. This gives students the freedom to choose their learning methods. In personalized learning, students are the directors of their own learning processes; they set their own goals, with the support of learning coaches, and actively demand learning activities to reach them. An important task for schools is to satisfy students’ learning demands on time by planning relevant in-class and self-study learning activities with the utilization of teachers and classrooms.

In these own-method own-pace personalized learning models, in which many students may demand many different activities at any time, neither students nor teachers are tied to fixed groups. Activity groups are formed flexibly each time by flexibly grouping student demands in activities and allocating suitable teachers and classrooms to activities. For example, a student may be with different groups of students and also with different teachers in learning activities. This study introduces the weekly flexible demand-driven learning activity planning problem of personalized learning, in which schools plan learning activities flexibly each week based on student demands. Due to the lack of fixed groups, this problem involves decisions on individual student demands and also on school resources (i.e., teachers, classrooms and time blocks).

Our problem partially relates to the educational timetabling problems due to common elements such as students, teachers and classrooms, and also due to common constraints such as scheduling conflicts, availability and capacity constraints. Educational timetabling problems are extensively studied (see Pillay, 2014; Pillay, 2016; Schaerf, 1999 for reviews). Many of these problems are computationally intractable, either NP-complete or NP-hard. Various approaches such as single-solution local search methods (Fonseca & Santos, 2014), population-based search methods (Beligiannis, Moschopoulos, Kaperonis, & Likothanassis, 2008; Santiago-Mozos, Salcedo-Sanz, DePrado-Cumplido, & Bousoño-Calzón, 2005), hyper-heuristics (Ahmed, Ozcan, & Kheiri, 2015; Pillay & Banzhaf, 2009), matheuristics (Dorneles, de Araújo, & Buriol, 2014), integer programming techniques (Fonseca, Santos, Carrano, & Stidsen, 2017; Phillips, Waterer, Ehrgott, & Ryan, 2015) and graph-theoretic approaches (Kannan et al., 2012) have been proposed and tested for these problems. Although the problems studied in the literature mostly concentrate on traditional educational models, there are a few studies which explore student-centred planning and timetabling problems. In Santiago-Mozos et al. (2005), a student-preference based course timetabling problem in a Spanish university is presented. Also, a more recent study (Kristiansen, Sørensen, & Stidsen, 2011) presents the student-centric elective course planning problem of Danish high schools. Kannan et al. (2012) proposed a multi-stage graph-theoretic approach to the scheduling problem of a group of personalized learning schools in New York City. Implementations of personalized learning differ in terms of the degree of freedom offered to students. In Santiago-Mozos et al. (2005), Kristiansen et al. (2011) and Kannan et al. (2012), students are only given the freedom to customize their learning paths, their curricula, by providing preferences over a set of courses at the beginning of a semester or a year. Our problem, on the other hand, considers models in which students are also given the freedom to progress at their own pace with their own methods for every learning activity of their courses at any time.

The studied educational timetabling problems usually consider traditional educational models and therefore study the assignment of predetermined events (e.g., course-class meetings in school timetabling (Pillay, 2014)) to available times. The main distinguishing aspect of our problem is that learning activities are not predetermined; they are planned based on student demands. Namely, which learning activities to plan and how many sessions of each learning activity to plan in a week are also decisions to be made in our problem. Thus, we classify this problem as a demand-driven planning problem rather than a timetabling problem.

The dynamic, flexible, and demand-driven nature of this planning problem provides opportunities to learn from and contribute to the state-of-the-art in the area of logistics; more specifically, warehouse order picking (de Koster, Le-Duc, & Roodbergen, 2007), dynamic vehicle routing (Pillac, Gendreau, Gueret, & Medaglia, 2013), train routing and scheduling (Cordeau, Toth, & Vigo, 1998), among many others. For example, modern warehouse order picking problems face the challenge of dynamic order arrivals due to the growth of e-commerce. Dynamically arriving orders must be batched in pick lists and then picked by order pickers as efficiently as possible (de Koster et al., 2007). Methodologically, parallels can be drawn between creating pick lists in this problem and creating activity groups by batching student demands in our problem. We get inspiration from and strive to provide insights for these complex and dynamic logistics problems, which can directly benefit from approximate systematic solution methods such as the ones we propose in this paper.

As also argued in Kannan et al. (2012), due to the freedom offered to students, finding good solutions becomes a computational challenge as the solution space grows. Therefore, this task is not suitable for simple solution strategies such as rules of thumb; instead, systematic planning tools are necessary to explore the solution space and find high-quality solutions. This paper aims to present an efficient systematic method for the flexible demand-driven planning problem. We demonstrate the applicability of our proposed method in the context of secondary schools in the Netherlands.

The remainder of the paper is organized as follows. Section 2 describes the weekly activity planning problem. Section 3 presents the MILP formulation. Section 4 presents the proposed heuristic approaches. Section 5 gives the computational experiments, which provide performance analysis of the proposed solution approaches. Section 6 provides practical insights and decision support for schools. Lastly, Section 7 concludes the paper.

2. Problem description

The weekly flexible demand-driven learning activity planning problem relates to schools which implement own-pace own-method personalized learning models. The learning activity planning in these models is initiated when students demand learning activities for the lesson units of the learning goals of their courses. Fig. 1a illustrates the mechanism of learning activity planning in these models, as opposed to the mechanism of one-size-fits-all models, which is depicted in Fig. 1b.

The main distinguishing aspect of planning in personalized learning compared to traditional education is the lack of fixed groups (i.e., classes). Yearly or semesterly produced weekly cyclic timetables are not appropriate in personalized learning, as learner groups are dynamically formed each time based on varying learning demands. In personalized learning, students can be grouped in learning activities flexibly and teachers and classrooms can be allocated flexibly as well. The planning problem of the schools here is to weekly plan in-class and self-study learning activities by flexibly composing activity groups and allocating resources to satisfy students’ learning demands.

The formal description of the problem with notation (see Table 1) is as follows.

In personalized learning schools, a set of in-class activities $a \in A$ are offered to a set of students $s \in S$. Each in-class learning activity $a \in A$ is uniquely associated to a lesson $l \in L$ of a learning goal $g \in G$ of a course $c \in C$. At the end of every week, students provide their high priority $A_s^1$ and low priority $A_s^2$ demands for the in-class activities that they would like to be planned for the following week. Their school plans a number of sessions of the demanded in-class learning activities in a set of time blocks $b \in B$ of a set of school days $d \in D$ for the next week by flexibly assigning teachers, classrooms and students to the sessions. All demands should be satisfied as much as possible with the efficient use of school resources. However, if possible, the number of activities on course $c$ that are assigned to student $s$ in day $d$ should not exceed a daily course limit of $CL$ for each day $d \in D$, for each course $c \in C$ and for each student $s \in S$. Any in-class activity session takes one time block. In personalized learning schools, self-study activities with available learning materials, such as online learning portals, can stand as an alternative to in-class learning. Therefore, usually a large self-study environment is available in schools for self-study learning activities.

Fig. 1. Personalized learning versus traditional education.

Table 1
Input notation.

Sets
$a \in A$ : set of in-class activities
$s \in S$ : set of students
$c \in C$ : set of courses
$g \in G$ : set of learning goals
$l \in L$ : set of lessons
$t \in T$ : set of teachers
$r \in R$ : set of classrooms
$d \in D$ : set of weekdays
$b \in B$ : set of time blocks

Subsets
$A_s^1$ : set of in-class activities demanded with high priority by student $s$
$A_s^2$ : set of in-class activities demanded with low priority by student $s$
$A_c$ : set of activities on course $c$
$A_c^1$ : set of in-class activities on course $c$ which can be only taught by first-level teachers
$A_a$ : set of in-class activities that precede activity $a$
$A^{ne}$ : set of in-class activities on conventional courses
$C^e$ : set of non-conventional courses
$R_c^e$ : set of classrooms of non-conventional course $c$
$R^{ne}$ : set of classrooms of conventional courses
$T_{bd}$ : set of teachers available in block $b$ of day $d$
$T_c$ : set of teachers of course $c$
$T_c^1$ : set of first-level teachers of course $c$
$B_c$ : set of time blocks that are preferred by course $c$

Parameters
$CL$ : daily course limit of students
$K_a$ : classroom capacity of in-class activity $a$
$WT$ : weekly in-class assignment limit of teachers
$SE$ : maximum number of students that a teacher can guide in self-study

In-class activity sessions take place in a set of classrooms $r \in R$. Some non-conventional courses $c \in C^e$ such as Physical Education, Art and Information Technologies can only be accommodated in equipped classrooms $R_c^e$ such as gyms, art studios and computer labs. The remaining conventional courses $c \in C - C^e$ can be accommodated in traditional classrooms $R^{ne}$ with no special equipment. The classroom capacities can be defined for activities without knowing explicitly in which classrooms they are going to be planned (see Assumption A1). The capacities of classrooms $K_a$ set a limit on the number of students that can be grouped in sessions of in-class activities $a \in A$.

Sessions are taught by a set of teachers $t \in T$. Each session should be assigned to a teacher. Some teachers work part-time; in any time block $b \in B$ of any day $d \in D$ only some teachers $t \in T_{bd}$ are available. Secondary school teachers in the Netherlands have two teaching qualification levels: first and second level. The first-level teachers $T_c^1$ of course $c \in C$ are qualified to teach any lesson of any learning goal of course $c$. The second-level teachers $T_c - T_c^1$ of course $c \in C$, on the other hand, are qualified to teach the lessons of the majority of learning goals, but not the ones of the higher levels $a \in A_c^1$. It is not desired that teachers are assigned to in-class activity sessions in a week more than a weekly limit of $WT$. Apart from in-class activities, teachers are also assigned to the self-study environment in order to guide self-study activities of students. Any teacher $t \in T$ can provide guidance for up to $SE$ students’ self-study activities. Sufficient teacher levels should be assigned for self-study activities as much as possible in any time block. Balancing teachers’ workloads in a week is desired in both in-class and self-study activity assignments. For a teacher, the in-class (self-study) activity workload in a week is measured by the utilization rate of his/her total available time with in-class (self-study) activities. Teachers $T_c$ of each course $c \in C$ desire to have similar in-class activity workloads as much as possible among themselves. For the self-study activities, all teachers $T$, regardless of their courses, desire to have similar self-study workloads.

Lastly, schools can indicate preferred time blocks $B_c$ for courses to plan in-class activities relating to courses. For example, for courses that require high concentration levels, schools may indicate a preference for morning time blocks for the activities relating to these courses.

This problem seeks to produce a weekly learning activity plan at the end of each week, for the coming week, by deciding on how many sessions of each activity to plan ($x_{abd}$), which students to assign to the planned activities ($z_{sabd}$) and which resources to assign to the planned in-class ($y_{tbd}$) and self-study activities ($y'_{tbd}$). Note that classroom assignment decisions are left out (see Table 2), and teachers are only assigned to in-class/self-study states without being assigned to specific sessions of learning activities; also, students are not explicitly assigned to specific activity sessions; they are only assigned to activities. With the flexibility in forming learning activity sessions, teachers do not have preferences over the in-class activities that they are assigned, students do not have preferences over the teachers of the sessions and over the students that they are assigned together to the sessions, and activity sessions do not have preferences over the classrooms. This allows a significant reduction in decision layers of the problem. The reduced decisions can be reconstructed by a post-processing procedure without compromising optimality. One such procedure is given in the online supplement.

Table 2
Decision variables.

Variable : Description
$z_{sabd} \in \{0, 1\}$ : 1 if student $s \in S$ is assigned to in-class activity $a \in A_s^1 \cup A_s^2$ in block $b \in B$ of day $d \in D$, 0 otherwise
$x_{abd} \in \mathbb{N}$ : number of sessions planned of in-class activity $a \in \bigcup_{s \in S} (A_s^1 \cup A_s^2)$ in block $b \in B$ of day $d \in D$
$y_{tbd} \in \{0, 1\}$ : 1 if teacher $t \in T$ is assigned to an in-class activity session in block $b \in B$ of day $d \in D$, 0 otherwise
$y'_{tbd} \in \{0, 1\}$ : 1 if teacher $t \in T$ is assigned to self-study environment in block $b \in B$ of day $d \in D$, 0 otherwise

Below we list our assumptions relating to the use of teachers and classrooms in this planning problem. These assumptions are confirmed by an expert to be mostly realistic in the case of the secondary schools in the Netherlands.

• A1: Each traditional classroom is identical to any other traditional classroom and each equipped classroom of a non-conventional course is identical to any other classroom of the same course. This assumption establishes a direct link between $x_{abd}$ and $z_{sabd}$ without the knowledge of classroom assignments. In fact, this assumption is also necessary for the decision reduction made on the classroom assignments.

• A2: When students are not assigned to in-class activities, they are doing self-study activities in the self-study environment.

• A3: The size of the self-study environment is large enough to accommodate all students in a school. Therefore, physical space assignments for self-study activities are not included in the problem.

• A4: Each teacher $t \in T$ is specialized in only one course $c \in C$. Note that this assumption is not necessary for the mathematical model. However, it simplifies the calculations in the feasibility checking phases of our constructive heuristics (see Section 4.2) and makes the course-based in-class workload imbalance calculations (see SC2) more meaningful.

The described problem seeks the desired weekly activity plan that satisfies all described hard constraints and violates as few soft constraints as possible.

3. MILP formulation

The plan that is to be produced must satisfy the following hard constraints.

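• HC1: A student cannot be assigned to more than one activity session at any time.

$$\sum_{a \in A} z_{sabd} \le 1 \qquad \forall s \in S,\ \forall b \in B,\ \forall d \in D \tag{1}$$

• HC2: A student can only be assigned to a session of an activity that (s)he demands.

$$\sum_{b \in B} \sum_{d \in D} z_{sabd} \le \mathbf{1}_{\{a \in A_s^1\}} + \mathbf{1}_{\{a \in A_s^2\}} \qquad \forall a \in A,\ \forall s \in S \tag{2}$$

• HC3: The number of students assigned to activity sessions must respect classroom capacities.

$$\frac{\sum_{s \in S} z_{sabd}}{K_a} \le x_{abd} \qquad \forall a \in A,\ \forall b \in B,\ \forall d \in D \tag{3}$$

• HC4: A suitable teacher must be assigned to each in-class activity session.

$$\sum_{a \in A_c^1} x_{abd} \le |T_{bd} \cap T_c^1| \qquad \forall c \in C,\ \forall b \in B,\ \forall d \in D \tag{4}$$

$$\sum_{a \in A_c} x_{abd} \le |T_{bd} \cap T_c| \qquad \forall c \in C,\ \forall b \in B,\ \forall d \in D \tag{5}$$

$$\sum_{t \in T_c^1 \cap T_{bd}} y_{tbd} \ge \sum_{a \in A_c^1} x_{abd} \qquad \forall c \in C,\ \forall b \in B,\ \forall d \in D \tag{6}$$

$$\sum_{t \in T_c \cap T_{bd}} y_{tbd} = \sum_{a \in A_c} x_{abd} \qquad \forall c \in C,\ \forall b \in B,\ \forall d \in D \tag{7}$$

Constraints (6) and (7) would be sufficient for satisfying HC4. However, constraints (4) and (5) are valid inequalities that provide a tighter LP relaxation. They are also used in the decomposition heuristic later (see Section 4.2.1).

• HC5: A teacher cannot be assigned to more than one activity at any time and (s)he can only be assigned if (s)he is available.

$$y'_{tbd} + y_{tbd} \le \mathbf{1}_{\{t \in T_{bd}\}} \qquad \forall t \in T,\ \forall b \in B,\ \forall d \in D \tag{8}$$

• HC6: A suitable classroom must be assigned to each in-class activity session.

$$\sum_{a \in A_c} x_{abd} \le |R_c^e| \qquad \forall c \in C^e,\ \forall b \in B,\ \forall d \in D \tag{9}$$

$$\sum_{a \in A^{ne}} x_{abd} \le |R^{ne}| \qquad \forall b \in B,\ \forall d \in D \tag{10}$$

Note that these constraints are only for making sure that the activity decisions are taken such that they will be feasible with respect to classroom resources. Specific classroom assignments are done via a post-processing procedure.

• HC7: Lessons of learning goals have precedence relations; students must follow them in the right order.

$$z_{sabd} \le \frac{\sum_{a' \in A_a} \sum_{(b',d') < (b,d)} z_{sa'b'd'}}{\sum_{a' \in A_a} \mathbf{1}_{\{a' \in A_s^1 \cup A_s^2\}}} \qquad \forall s \in S,\ \forall b \in B,\ \forall d \in D,\ \forall a \in A_s^1 \cup A_s^2 \ \text{s.t.}\ |(A_s^1 \cup A_s^2) \cap A_a| \ne 0 \tag{11}$$

The notation $(b',d') < (b,d)$ is used to denote all blocks $(b',d')$ that precede time block $b$ of day $d$.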
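As an illustration of how the first three hard constraints can be verified on a candidate plan, the following is a minimal Python sketch. It is not the authors' implementation: the dictionary and set based data layout and the function name `check_hc1_hc3` are assumptions made for this example, and it assumes that the high and low priority demand sets of a student are disjoint.

```python
from collections import defaultdict
from math import ceil


def check_hc1_hc3(z, x, demands, capacity):
    """Return the violated hard constraints (HC1-HC3) of a candidate plan.

    z        : set of (s, a, b, d) tuples for which z_sabd = 1
    x        : dict (a, b, d) -> number of planned sessions x_abd
    demands  : dict s -> set of demanded activities (A1_s union A2_s)
    capacity : dict a -> classroom capacity K_a
    (This data layout is hypothetical, chosen only for illustration.)
    """
    violations = []
    per_slot = defaultdict(int)      # (s, b, d) -> activities assigned in that block
    per_activity = defaultdict(int)  # (s, a) -> assignments of s to a in the week
    group_size = defaultdict(int)    # (a, b, d) -> students grouped in a at (b, d)

    for (s, a, b, d) in z:
        per_slot[(s, b, d)] += 1
        per_activity[(s, a)] += 1
        group_size[(a, b, d)] += 1

    # HC1: at most one activity per student per time block.
    for key, count in per_slot.items():
        if count > 1:
            violations.append(("HC1", key))

    # HC2: only demanded activities, each at most once in the week.
    for (s, a), count in per_activity.items():
        if a not in demands.get(s, set()) or count > 1:
            violations.append(("HC2", (s, a)))

    # HC3: with integer x_abd, sum_s z_sabd / K_a <= x_abd is the same as
    # ceil(group size / K_a) <= x_abd.
    for (a, b, d), size in group_size.items():
        if ceil(size / capacity[a]) > x.get((a, b, d), 0):
            violations.append(("HC3", (a, b, d)))

    return violations
```

An empty return value means that the plan respects constraints (1) to (3); the teacher, classroom and precedence constraints (4) to (11) would need analogous checks.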

In addition to the presented hard constraints, there are several soft constraints regarding the quality of learning activity plans. These soft constraints do not have to be satisfied; however, they are desired to be satisfied as much as possible. For each soft constraint an auxiliary variable is defined (see Table 3). These variables measure the violations of each soft constraint.

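• SC1: Satisfying high and low priority student demands.

$$\alpha_s^1 \ge \sum_{a \in A_s^1} \left( 1 - \sum_{b \in B} \sum_{d \in D} z_{sabd} \right) \qquad \forall s \in S \tag{12}$$

$$\alpha_s^2 \ge \sum_{a \in A_s^2} \left( 1 - \sum_{b \in B} \sum_{d \in D} z_{sabd} \right) \qquad \forall s \in S \tag{13}$$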

Table 3
Auxiliary variables.

Variable : Description
$\alpha_s^1 \in \mathbb{N}$ : number of unmet high priority demands of student $s \in S$
$\alpha_s^2 \in \mathbb{N}$ : number of unmet low priority demands of student $s \in S$
$\alpha_c^3 \in \mathbb{R}_0^+$ : extent of imbalance in the in-class activity workloads among teachers in $T_c$
$\alpha_c^4 \in \mathbb{N}$ : number of times that sessions of in-class activities in $A_c$ are planned in $B - B_c$
$\alpha_{scd}^5 \in \mathbb{N}$ : extent of violating daily course limit of $CL$ for student $s \in S$ for course $c \in C$ in day $d \in D$
$\alpha_t^6 \in \mathbb{N}$ : extent of violating weekly in-class assignment limit of $WT$ for teacher $t \in T$
$\alpha_{bd}^7 \in \mathbb{N}$ : number of shortage teachers in self-study environment in time block $b \in B$ of day $d \in D$
$\alpha^8 \in \mathbb{N}$ : extent of imbalance in the self-study activity workloads among teachers $T$
$\alpha^9 \in \mathbb{N}$ : number of planned in-class activity sessions in a week

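• SC2: Balancing teachers’ in-class workloads.

$$\alpha_c^3 = \beta_c^{\text{max-class}} - \beta_c^{\text{min-class}} \qquad \forall c \in C, \quad \text{where } \beta_c^{\text{max-class}}, \beta_c^{\text{min-class}} \in \mathbb{R}_0^+$$

$$\beta_c^{\text{min-class}} \le \frac{\sum_{b \in B} \sum_{d \in D} y_{tbd}}{\sum_{b \in B} \sum_{d \in D} \mathbf{1}_{\{t \in T_{bd}\}}} \qquad \forall t \in T_c \ \text{s.t.}\ \sum_{b \in B} \sum_{d \in D} \mathbf{1}_{\{t \in T_{bd}\}} > 0,\ \forall c \in C$$

$$\beta_c^{\text{max-class}} \ge \frac{\sum_{b \in B} \sum_{d \in D} y_{tbd}}{\sum_{b \in B} \sum_{d \in D} \mathbf{1}_{\{t \in T_{bd}\}}} \qquad \forall t \in T_c \ \text{s.t.}\ \sum_{b \in B} \sum_{d \in D} \mathbf{1}_{\{t \in T_{bd}\}} > 0,\ \forall c \in C \tag{14}$$

The method employed here for measuring the imbalance is the simple method of taking the difference between maximum and minimum workloads.

• SC3: Planning activity sessions in their preferred time blocks.

$$\alpha_c^4 = \sum_{a \in A_c} \sum_{d \in D} \sum_{b \notin B_c} x_{abd} \qquad \forall c \in C \tag{15}$$

• SC4: Limiting the extent of exceeding students’ daily course limit of $CL$.

$$\alpha_{scd}^5 \ge \sum_{a \in A_c} \sum_{b \in B} z_{sabd} - CL \qquad \forall s \in S,\ \forall c \in C,\ \forall d \in D \tag{16}$$

• SC5: Limiting the extent of exceeding teachers’ weekly in-class assignment limit of $WT$.

$$\alpha_t^6 \ge \sum_{b \in B} \sum_{d \in D} y_{tbd} - WT \qquad \forall t \in T \tag{17}$$

• SC6: Assigning required numbers of teachers to self-study at any time.

$$\alpha_{bd}^7 \ge \frac{\sum_{s \in S} \left( 1 - \sum_{a \in A} z_{sabd} \right)}{SE} - \sum_{t \in T} y'_{tbd} \qquad \forall b \in B,\ \forall d \in D \tag{18}$$

• SC7: Balancing teachers’ self-study workloads.

$$\alpha^8 = \beta^{\text{max-self}} - \beta^{\text{min-self}}, \quad \text{where } \beta^{\text{max-self}}, \beta^{\text{min-self}} \in \mathbb{R}_0^+$$

$$\beta^{\text{min-self}} \le \frac{\sum_{b \in B} \sum_{d \in D} y'_{tbd}}{\sum_{b \in B} \sum_{d \in D} \mathbf{1}_{\{t \in T_{bd}\}}} \qquad \forall t \in T \ \text{s.t.}\ \sum_{b \in B} \sum_{d \in D} \mathbf{1}_{\{t \in T_{bd}\}} > 0$$

$$\beta^{\text{max-self}} \ge \frac{\sum_{b \in B} \sum_{d \in D} y'_{tbd}}{\sum_{b \in B} \sum_{d \in D} \mathbf{1}_{\{t \in T_{bd}\}}} \qquad \forall t \in T \ \text{s.t.}\ \sum_{b \in B} \sum_{d \in D} \mathbf{1}_{\{t \in T_{bd}\}} > 0 \tag{19}$$

• SC8: Minimizing the number of sessions planned in a week.

$$\alpha^9 = \sum_{b \in B} \sum_{d \in D} \sum_{a \in A} x_{abd} \tag{20}$$

This constraint imposes the efficiency in planning.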

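We define for each penalty auxiliary variable $\alpha_*^n$, $n = 1, 2, \ldots, 9$, a weight parameter $w_n \in \mathbb{R}^+$, $n = 1, 2, \ldots, 9$, and formulate the following cost function, which is the weighted sum of violations of the soft constraints of our problem.

$$\text{MIN} \quad \sum_{n=1}^{2} \sum_{s \in S} \alpha_s^n w_n + \sum_{n=3}^{4} \sum_{c \in C} \alpha_c^n w_n + \sum_{s \in S} \sum_{c \in C} \sum_{d \in D} \alpha_{scd}^5 w_5 + \sum_{t \in T} \alpha_t^6 w_6 + \sum_{b \in B} \sum_{d \in D} \alpha_{bd}^7 w_7 + \sum_{n=8}^{9} \alpha^n w_n \tag{21}$$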
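To make the structure of the cost function (21) concrete, the sketch below computes two of the violation measures, the unmet demands of SC1 and the max-minus-min workload imbalance of SC2 and SC7, and then forms the weighted sum. It is a simplified illustration under assumed data structures (plain dictionaries and sets); the function names are hypothetical and the remaining violation measures are left out for brevity.

```python
def unmet_demands(assigned, demanded):
    """alpha^1_s or alpha^2_s: demanded activities that were not assigned."""
    return len(demanded - assigned)


def workload_imbalance(assigned_blocks, available_blocks):
    """alpha^3_c or alpha^8: maximum minus minimum utilization rate.

    assigned_blocks  : dict teacher -> blocks assigned in the week
    available_blocks : dict teacher -> blocks in which the teacher is available
    Teachers without any availability are skipped, as in (14) and (19).
    """
    rates = [assigned_blocks.get(t, 0) / available_blocks[t]
             for t in available_blocks if available_blocks[t] > 0]
    return max(rates) - min(rates) if rates else 0.0


def weighted_cost(alpha, weights):
    """Weighted sum of violations in the spirit of (21).

    alpha   : dict n -> list of violation values of type n
              (one entry per student, course, teacher, ... as applicable)
    weights : dict n -> weight w_n
    """
    return sum(weights[n] * sum(values) for n, values in alpha.items())


# Usage example with made-up numbers.
alpha = {
    1: [unmet_demands({"a1"}, {"a1", "a2"}), 0],                  # two students
    3: [workload_imbalance({"t1": 2, "t2": 1}, {"t1": 4, "t2": 4, "t3": 0})],
}
print(weighted_cost(alpha, {1: 10.0, 3: 4.0}))
```

The weights in the usage example are arbitrary; in the paper they are parameters $w_n$ chosen by the planner.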

We show that the weekly flexible demand-driven learning activity planning problem is NP-hard. In fact, it can be shown that many of the educational timetabling problems, which are already proven intractable in the literature, are special cases of this problem. For instance, the student scheduling problem, which is proven NP-hard by Cheng, Kruk, and Lipman (2002), only assigns students to course sections (which can be thought of as the activity sessions of our problem) to fulfill student demands and therefore is a special case of our problem, where sessions of activities already have been planned and assigned to times, teachers and classrooms.

4. Dynamic thompson sampling hyper-heuristic framework

The MILP model described in the previous section is not solvable by the state-of-the-art solvers within reasonable times for moderate sized instances. Consequently, we consider an approximate approach and develop a hyper-heuristic framework for producing solutions for this intractable problem. Hyper-heuristics, which are heuristic search methods that use heuristic methods to choose from a pool of simpler (low-level) heuristics, are already in use to solve many computationally intractable educational timetabling problems (Ahmed et al., 2015; Burke, McCollum, Meisels, Petrovic, & Qu, 2007; Pillay & Banzhaf, 2009). The use of hyper-heuristics in educational timetabling has recently been reviewed by Pillay (2016). This section presents our dynamic Thompson sampling single-solution selection hyper-heuristic framework. The overall solution methodology is summarized in Fig. 2.

Differently from typical educational timetabling problems, this planning problem contains a very large number of decision dimensions. This fact enables many opportunities for different solution methods and strategies. For instance, there are many ways of encoding solutions and performing search on various spaces. With the motivation of exploring many solution approaches, here we describe two different constructive heuristics which we use as initial solution generators for our hyper-heuristic local search. In fact, in the online supplement, we also briefly discuss other heuristics that we test for this problem; these also include genetic algorithms which explore different solution representations. Also, with the involvement of numerous soft constraints in our problem, exactly nine, the number of low-level heuristics in our hyper-heuristic framework is significantly higher than that of a typical educational timetabling problem.

Fig. 2. Solution methodology.

4.1. Solution encoding

We use direct encoding, which maps each student $s \in S$ to one of the in-class activities in $A_s^1 \cup A_s^2$ or to self-study, which we call “student solution”, and each teacher $t \in T$ to in-class or self-study assignment, or to the idle state, which we call “teacher solution”, for every time block $b \in B$ of every day $d \in D$. The number of sessions planned of each in-class activity $a \in A$, $x_{abd}$, is determined indirectly from the student solution as classrooms of activities $K_a$ have fixed sizes.
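The indirect determination of $x_{abd}$ from a student solution can be sketched as follows. This is a minimal illustration under assumptions: the student solution is stored as a dictionary keyed by (student, block, day), the marker `SELF_STUDY` is a hypothetical placeholder, and the number of sessions is taken to be the smallest integer satisfying constraint (3).

```python
from collections import Counter
from math import ceil

SELF_STUDY = "self-study"  # hypothetical marker for the self-study state


def sessions_from_student_solution(student_solution, capacity):
    """Derive x_abd from a direct student encoding.

    student_solution : dict (s, b, d) -> activity or SELF_STUDY
    capacity         : dict a -> classroom capacity K_a
    Returns a dict (a, b, d) -> minimal number of sessions that seats the group.
    """
    group_size = Counter(
        (a, b, d)
        for (s, b, d), a in student_solution.items()
        if a != SELF_STUDY
    )
    return {key: ceil(n / capacity[key[0]]) for key, n in group_size.items()}


# Usage example: three students demand activity "a1" in block 0 of day 0,
# while a fourth student does self-study in the same block.
solution = {
    ("s1", 0, 0): "a1",
    ("s2", 0, 0): "a1",
    ("s3", 0, 0): "a1",
    ("s4", 0, 0): SELF_STUDY,
}
print(sessions_from_student_solution(solution, {"a1": 2}))
```

With a capacity of two students per session, three students in the same activity and block require two sessions.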

4.2. Initial solutions

Our hyper-heuristic framework applies local search on a feasible good-quality initial solution. Due to the fact that our problem involves many decisions, it is difficult to construct a solution heuristically in an integrative way. In this study, a feasible initial solution is obtained in two phases with greedy constructive heuristics. Firstly, a feasible initial student solution is built and then the generated student solution is used to construct the corresponding feasible teacher solution, to fulfill teacher assignment requirements for the activities planned with student assignments. Two distinct constructive heuristics are utilized to generate good-quality initial solutions. These heuristics differ in the first phase of generating student solutions.

A student solution can always be generated in a feasible way with respect to teacher assignment requirements of activity sessions, without the explicit decisions on teacher assignments ($y_{tbd}$, $y'_{tbd}$). Specifically, this feasibility relates to HC4. With assumption A4, the computational effort of checking this feasibility is not significant at all; only the available teacher levels in each course need to be in line with the number of sessions planned of that course; constraints (4) and (5) are sufficient to check this feasibility, without the need of constraints (6) and (7). However, this decomposition of the solution into student and teacher solutions will affect the solution quality as there are some dependencies between student and teacher solutions. Namely, the firstly built student solution will directly affect the quality in the self-study environment, concerning the teacher shortages in the environment, and also the teachers’ weekly workloads and their workload imbalances. The proposed heuristics build student solutions before teacher solutions because meeting student demands is prioritized over other soft constraints in practical instances. Besides, the construction phase is only the first phase of our approach; the local search applied after this phase is able to improve a solution with respect to the teacher-related quality metrics, as local search is made in an integrative manner on both student and teacher solutions.

4.2.1. Student solutions

This section describes the two methods that are used to generate feasible student solutions. The detailed pseudocodes of the methods can be found in the online supplement.

Batch method: At each iteration, this heuristic plans a feasible in-class session by selecting a learning activity and an available time block of a day; then a number of students, up to the classroom capacity of the selected activity, who can feasibly be assigned to the selected activity at the selected time are assigned in a greedy fashion. The priorities of the activities to be selected at iterations are determined based on their potential to reduce the costs which relate to the student solution (costs due to $\alpha_s^1$, $\alpha_s^2$, $\alpha_c^4$ and $\alpha_{scd}^5$). This procedure stops when there are no possibilities to plan feasible sessions. The feasibility of planning a new session of a learning activity is determined by the levels of the suitable teachers and classrooms of the course of the activity (by checking constraints (4) and (5) and constraints (9) and (10)).
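The greedy batching idea can be sketched in a few lines of Python. This is a heavily simplified illustration and not the pseudocode of the online supplement: the priority of a candidate session is assumed to be the weighted number of demands it would satisfy (weights `w_high` and `w_low` are hypothetical), the teacher and classroom checks of constraints (4), (5), (9) and (10) are replaced by a single assumed limit `max_sessions_per_slot`, and lesson precedence, preferred time blocks and the daily course limit are ignored.

```python
def batch_plan(demands, capacity, slots, max_sessions_per_slot,
               w_high=2.0, w_low=1.0):
    """Greedy batching sketch; returns a dict (s, b, d) -> activity.

    demands  : dict s -> {"high": set of activities, "low": set of activities}
    capacity : dict a -> classroom capacity K_a
    slots    : list of (b, d) time blocks
    max_sessions_per_slot : simplified stand-in for teacher/classroom feasibility
    """
    plan = {}
    pending = {s: {"high": set(d["high"]), "low": set(d["low"])}
               for s, d in demands.items()}
    sessions_in_slot = {slot: 0 for slot in slots}

    while True:
        best = None  # (score, activity, slot, students)
        activities = {a for p in pending.values() for a in p["high"] | p["low"]}
        for slot in slots:
            if sessions_in_slot[slot] >= max_sessions_per_slot:
                continue
            # Students who are not yet assigned to anything in this block.
            free = [s for s in sorted(pending) if (s, *slot) not in plan]
            for a in sorted(activities):
                high = [s for s in free if a in pending[s]["high"]]
                low = [s for s in free
                       if a in pending[s]["low"] and s not in high]
                group = (high + low)[:capacity[a]]  # high priority first
                if not group:
                    continue
                score = sum(w_high if s in high else w_low for s in group)
                if best is None or score > best[0]:
                    best = (score, a, slot, group)
        if best is None:
            break  # no feasible session can be planned any more
        _, a, slot, group = best
        for s in group:
            plan[(s, *slot)] = a
            pending[s]["high"].discard(a)
            pending[s]["low"].discard(a)
        sessions_in_slot[slot] += 1
    return plan
```

Each iteration plans exactly one session and removes the satisfied demands, so the loop terminates once every remaining demand is either infeasible or already met.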

Decomposition method: This heuristic decomposes the student solution part of the MILP model per time block per day. The subproblem of each time block of each day is simple enough to be solved by Gurobi to almost optimality quickly. This simplification is mostly due to the fact that the subproblem does not contain the computationally challenging lesson precedence constraints (11). Each subproblem consists of constraints (1)–(5), (8)–(10), (12) and (13), (15) and (16) and (20) with reduced time block and day dimensions and an objective function that considers only the student solution-related quality measures in (21).

4.2.2. Teachersolutions

A workload balancing constructive heuristic is used to build a teacher solution from a student solution. In this heuristic, initially all teachers are considered idle in every time block of every day. For every block $b \in B$ of each day $d \in D$, firstly the teacher assignments for sessions of in-class activities that require first-level teachers are made. For each course $c \in C$, only first-level available teachers, $t \in T_{bd} \cap T_c^1$, are considered. In this process, teachers who are least assigned to in-class activities are prioritized. This is to balance teachers' in-class activity workloads. When assignments for the sessions of activities in $A_c^1$ are made, assignments are made for the sessions of the remaining activities of course $c \in C$ in the same way. Note that an assigned teacher is not considered available anymore for later assignments in the same time block. This is repeated for every course $c \in C$, and then, lastly, a number of teachers who are still available are assigned for self-study activities. This number is based on SE and the number of students who are not assigned to any in-class activities in time block $b$ of day $d$. Assignments for self-study are also performed in a similar fashion, where teachers' current self-study workloads are considered for prioritizing teachers. This procedure is repeated until teacher assignments are made for every block $b \in B$ of each day $d \in D$. The pseudocode of this heuristic is also given in the online supplement.
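The in-class part of this assignment rule, for a single time block, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pseudocode (which is in the online supplement): the `Session` record, the `first_level` and `qualified` dictionaries and the function name are hypothetical, sessions of the remaining activities are assumed to accept any teacher of the course, ties are broken by teacher name, and the self-study step is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    course: str
    needs_first_level: bool  # True if the activity belongs to A_c^1


def assign_teachers_in_block(sessions, available, first_level, qualified, load):
    """Assign one teacher per in-class session of a single time block.

    sessions    : list of Session objects planned in this block
    available   : set of teachers available in this block (T_bd)
    first_level : dict course -> set of first-level teachers (T_c^1)
    qualified   : dict course -> set of all teachers of the course (assumption)
    load        : dict teacher -> in-class sessions assigned so far (updated in place)
    Returns a dict session index -> teacher, or None if no teacher is free.
    """
    free = set(available)
    assignment = {}
    # Course by course; within a course, first-level sessions come first.
    order = sorted(
        range(len(sessions)),
        key=lambda i: (sessions[i].course, not sessions[i].needs_first_level),
    )
    for i in order:
        s = sessions[i]
        pool = first_level[s.course] if s.needs_first_level else qualified[s.course]
        # Least-loaded teacher first, to balance in-class workloads.
        candidates = sorted(pool & free, key=lambda t: (load[t], t))
        if not candidates:
            assignment[i] = None
            continue
        teacher = candidates[0]
        assignment[i] = teacher
        load[teacher] += 1
        free.remove(teacher)  # not available again in the same block
    return assignment
```

Calling this function once per block and day, with a shared `load` dictionary, gives the balancing effect described above, because each block sees the workloads accumulated in earlier blocks.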

4.3. Dynamic Thompson sampling local search hyper-heuristic

We develop a single-solution hyper-heuristic framework which uses the dynamic multi-armed bandit algorithm of dynamic Thompson sampling to improve a constructed initial greedy solution. Our framework works as a local search method on a single solution with a pool of predefined low-level heuristics that act as neighborhood structures. Single-solution local search selection hyper-heuristics perform search on a single solution with heuristic selection and move acceptance processes until a stopping criterion is met (Burke et al., 2009). The heuristic selection process at each iteration selects a low-level heuristic from the pool to apply on the current solution. After a candidate solution is found by applying the selected low-level heuristic on the current solution, the move acceptance process decides whether to accept or reject the candidate solution.

In our framework, the deterministic acceptance criterion of accepting only non-worsening solutions is selected. It is true that accepting only non-worsening solutions limits the scope of the search, and move acceptance strategies such as simulated annealing or threshold acceptance, which also accept some worsening solutions, can be useful for escaping local optima. However, our investigations could not find any benefit of using these acceptance strategies in this problem. This is likely related to the fact that the search space of this problem is so large that it may take tremendous computational effort for these strategies to make a fine exploration of the search space. Therefore, we limit ourselves to locally optimal solutions. However, it is important to note that this does not lead to a myopic search, since our framework consists of 22 neighbourhoods.

Hyper-heuristic frameworks that use learning to guide the heuristic selection process use historical performances of low-level heuristics as guidance. When learning takes place during the search process of an instance, frameworks are classified as online learning hyper-heuristics (Burke et al., 2009). Here, a dynamic Thompson sampling-based online learning mechanism is presented to guide the heuristic selection process, that is, to select an appropriate low-level heuristic at each iteration of the search. The dynamic Thompson sampling (DTS) algorithm, which is introduced by Gupta, Granmo, and Agrawala (2011) to solve dynamic multi-armed bandit problems, is integrated in the heuristic selection process of our framework.

We arrived at integrating this learning algorithm in our framework by recognizing the parallelism between "the search game", selecting a heuristic at each iteration to reach a good solution at the end of the search in dynamic search spaces, and "the gambler's game", selecting an arm to pull at each step to reach a state with a high reward in dynamic environments (for a similar parallelism see Fialho, Da Costa, Schoenauer, & Sebag (2010)). Multi-armed bandit problems are concerned with the balance of exploitation and exploration in games. The typical multi-armed bandit problem of static environments considers a single player, the so-called gambler, who at each step of a sequential game chooses an arm to pull from a given set of arms, which are associated with unknown probabilistic reward mechanisms, in order to maximize his/her total expected reward at the end of the game. The gambler learns about the reward distributions of the arms as time passes, in an online fashion, which (s)he can exploit in the next steps of the game. However, the gambler may also choose to increase his/her knowledge of the reward mechanisms of the arms by exploring. In the static version, the reward mechanisms of the arms do not change in time, such that there is a best arm that the gambler wants to discover. However, in the search games of heuristics there is not a single heuristic/operator that would be best at any time (Da Costa, Fialho, Schoenauer, & Sebag, 2008) for any solution. Hence, the dynamic version of the multi-armed bandit problem is more suitable for building a parallelism with the search game. Many dynamic versions of multi-armed bandit algorithms, such as the Upper Confidence Bound (UCB) bandit algorithm, have already been tested (Fialho et al., 2010) for guiding search processes. In this study, the performance of the DTS algorithm in Gupta et al. (2011) as a heuristic selector is tested to explore further how dynamic multi-armed bandit based algorithms perform as operator/heuristic selectors in search algorithms.

The DTS algorithm in Gupta et al. (2011) is introduced for dynamic bandit problems in which the reward probabilities of the beta-Bernoulli arms are Brownian motion processes. This algorithm is an order statistics-based Thompson sampling which tracks dynamic changes in reward probabilities with an exponential filtering technique. Gupta et al. (2011) demonstrate that the DTS algorithm outperforms Thompson sampling and two UCB-based algorithms for dynamic bandits. Our framework considers a beta-Bernoulli bandit for the heuristic selection process; rewards of low-level heuristics follow Bernoulli distributions and reward success probabilities of heuristics follow beta distributions. When a low-level heuristic improves the current solution, it is considered a success and the heuristic is rewarded. This mechanism does not consider the extent of improvements. Our choice is deliberate, to give fairer chances to the low-level heuristics. For instance, some low-level heuristics that act on the teacher solutions, although not having great chances of improving the current solution to a great extent immediately, create high-improvement opportunities for the succeeding low-level heuristics that act on student solutions. Our heuristic selection process uses the expectation values of the beta-distributed reward success probabilities of low-level heuristics.
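For reference, the standard beta-Bernoulli quantities behind this selection rule can be written as below. Here $\theta_k$ is our own notation (not the paper's) for the unknown success probability of low-level heuristic $k$, and $r \in \{0, 1\}$ is the observed reward. The update shown is the plain Bayesian one; the DTS variant additionally discounts the parameters once their sum reaches a threshold.

```latex
\theta_k \sim \mathrm{Beta}(\alpha_k, \beta_k), \qquad
\mathbb{E}[\theta_k] = \frac{\alpha_k}{\alpha_k + \beta_k}, \qquad
(\alpha_k, \beta_k) \;\xrightarrow{\; r \;}\; (\alpha_k + r,\; \beta_k + 1 - r).
```

For example, a heuristic with $\alpha_k = 3$ and $\beta_k = 1$ has an expected success probability of $3/4$.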

Algorithm 1 presents the pseudocode of our framework of dynamic Thompson sampling based single-solution hyper-heuristic (DTSHH). This framework uses three parameters that relate to the DTS algorithm. $C_{DTS}$ denotes a threshold value that reflects for how long to postpone the tracking of changes in reward probabilities. The remaining parameters are the initial values of the beta distribution parameters $\alpha_k$ and $\beta_k$ for each low-level heuristic $k$. Note that when $C_{DTS}$ is sufficiently large, our heuristic selection process will behave as a traditional Thompson sampling algorithm, which does not track the changes at all, by only using the first set of parameter update rules, i.e., line 16 of Algorithm 1.

Algorithm 1 Pseudocode of the dynamic Thompson sampling hyper-heuristic (DTSHH) framework.

1: Initialize $C_{DTS}$ and $\alpha_k$, $\beta_k$ for $k = 1, \dots, N$; $Pool \leftarrow \{LLH_1, \dots, LLH_N\}$;

2: $S_{current} \leftarrow$ GenerateInitialGreedySolutions;

3: $f_{current} \leftarrow$ CalculateObjective($S_{current}$);

4: while time & iteration limit not reached do

5:   $h \leftarrow$ Find $k \in \{1, \dots, N\}$ s.t. $\frac{\alpha_k}{\alpha_k + \beta_k} = \max_{n \in \{1, \dots, N\}} \frac{\alpha_n}{\alpha_n + \beta_n}$;

6:   $reward \leftarrow 0$;

7:   $S_{candidate} \leftarrow$ Apply($S_{current}$, $LLH_h$);

8:   $f_{candidate} \leftarrow$ CalculateObjective($S_{candidate}$);

9:   if $f_{candidate} \le f_{current}$ then

10:     if $f_{candidate} < f_{current}$ then

11:       $reward \leftarrow 1$;

12:     end if

13:     $S_{current} \leftarrow S_{candidate}$; $f_{current} \leftarrow f_{candidate}$;

14:   end if

15:   if $\alpha_h + \beta_h < C_{DTS}$ then

16:     $\alpha_h \leftarrow \alpha_h + reward$; $\beta_h \leftarrow \beta_h + (1 - reward)$;

17:   else

18:     $\alpha_h \leftarrow (\alpha_h + reward) \frac{C_{DTS}}{C_{DTS} + 1}$; $\beta_h \leftarrow (\beta_h + (1 - reward)) \frac{C_{DTS}}{C_{DTS} + 1}$;

19:   end if

20: end while

21: return $S_{current}$
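The loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names `dts_update` and `dtshh`, the `heuristics` interface (callables taking a solution and a random generator) and the `max_non_improving` stopping rule are assumptions, and ties in the selection step are broken uniformly at random, which Algorithm 1 does not specify.

```python
import random


def dts_update(alpha, beta, reward, c_dts):
    """DTS parameter update for the selected heuristic (Algorithm 1, lines 15-19)."""
    if alpha + beta < c_dts:
        # Plain beta-Bernoulli update while the threshold is not reached.
        return alpha + reward, beta + (1 - reward)
    # Discounted update: old evidence is down-weighted by C / (C + 1).
    scale = c_dts / (c_dts + 1)
    return (alpha + reward) * scale, (beta + (1 - reward)) * scale


def dtshh(initial, objective, heuristics, c_dts=60, alpha0=3.0, beta0=3.0,
          max_non_improving=1000, rng=None):
    """Single-solution hyper-heuristic that minimizes `objective`."""
    rng = rng or random.Random(0)
    n = len(heuristics)
    alpha = [alpha0] * n
    beta = [beta0] * n
    current, f_current = initial, objective(initial)
    stalled = 0  # consecutive non-improving iterations (assumed stopping rule)
    while stalled < max_non_improving:
        # Select the heuristic with the largest expected success probability.
        means = [alpha[k] / (alpha[k] + beta[k]) for k in range(n)]
        best = max(means)
        h = rng.choice([k for k in range(n) if means[k] == best])
        candidate = heuristics[h](current, rng)
        f_candidate = objective(candidate)
        # Reward only strict improvements, regardless of their size.
        reward = 1 if f_candidate < f_current else 0
        if f_candidate <= f_current:  # accept non-worsening moves
            current, f_current = candidate, f_candidate
        stalled = 0 if reward else stalled + 1
        alpha[h], beta[h] = dts_update(alpha[h], beta[h], reward, c_dts)
    return current, f_current
```

Note that once $\alpha_h + \beta_h$ equals $C_{DTS}$, the discounted branch keeps the sum at $C_{DTS}$, since $(C_{DTS} + 1) \cdot C_{DTS} / (C_{DTS} + 1) = C_{DTS}$. This bounds the weight of past observations and lets the estimate follow changes in a heuristic's success rate.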


We classify our low-level heuristics in two groups: (1) generic and (2) tailor-made low-level heuristics. We have 12 generic low-level heuristics which are often used in local-search (perturbative) selection hyper-heuristics in solving various educational timetabling problems (Pillay, 2016). These heuristics apply simple mutation (e.g., move and swap) and hill-climbing operations to perturb a current solution.

In addition to these, 10 tailor-made low-level heuristics are developed which aim to reduce violations of specific soft constraints of our planning problem. Our tailor-made low-level heuristics deliberately guide the search to neighbourhoods that may potentially contain better solutions, as opposed to the generic ones, which do not use any guidance to direct their operations. Some of these tailor-made low-level heuristics focus on improving the demand satisfaction measures. These heuristics try to satisfy unmet student demands either by constructing new in-class learning activities or by increasing the utilization of already created activities. The remaining tailor-made low-level heuristics focus on reducing the violations of other soft constraints. Each one of these focuses on a specific soft constraint and tries to reduce its violation by destructing the current solution.

The size of this low-level heuristic pool is considerably large compared to existing hyper-heuristics and also local search methods in the literature, in general. The fact that there are many soft constraints involved in our problem, and also the fact that student and teacher solutions are searched in an integrative way during the local search, have created the need for developing many low-level heuristics. Naturally, the large size of the heuristic pool makes the task of heuristic selection more important for the effectiveness and efficiency of the search process, requiring an intelligent selection mechanism. Below we briefly explain these low-level heuristics.

• $LLH_1^{generic}$: It randomly picks a student and a time block of a day, and it randomly changes the plan of the picked student to a randomly picked in-class activity that the student demands or to self-study in the picked time.

• $LLH_2^{generic}$: It randomly picks an available teacher in a randomly picked time block of a day, and it randomly changes the assignment state of the picked teacher to either the in-class or the self-study state in the picked time.

• $LLH_3^{generic}$: It randomly picks a student and a time block of a day, and changes the plan of the picked student to the in-class activity or self-study that results in the largest cost reduction in the objective function.

• $LLH_4^{generic}$: It randomly picks an available teacher in a randomly picked time block of a day, and changes the plan of the picked teacher to the assignment state that results in the largest cost reduction in the objective function.

• $LLH_5^{generic}$: It randomly picks a student and two different times, and swaps the plans of the picked student in these times.

• $LLH_6^{generic}$: It randomly picks a teacher who is assigned to the in-class state in a randomly picked time block of a day, and randomly picks another teacher who is a teacher of the course that the first picked teacher teaches and is either idle or assigned to the self-study state. Then, this heuristic assigns the first teacher to the idle state and the second teacher to the in-class state in the picked time.

• $LLH_7^{generic}$: It randomly picks a student and a time block of a day, and it randomly changes the plan of the picked student to a randomly picked in-class activity of his/her demand such that the assignment of the student would not require the creation of a new activity session.

• $LLH_8^{generic}$: It randomly picks a student and an in-class activity that the student demands. Then, it finds a random time at which the student is assigned to self-study and can be feasibly assigned to the picked in-class activity.

• $LLH_9^{generic}$: It randomly picks a student and a time block of a day in which the student is assigned to an in-class activity. The aim is to move this assignment to another time. The heuristic randomly selects a different time at which the student can be feasibly assigned to the activity that (s)he is assigned to in the first picked time. Then, the student is assigned to self-study in the first picked time block.

• $LLH_{10}^{generic}$: This heuristic swaps two students who are assigned to two different sessions, at different times, of the same in-class activity.

• $LLH_{11}^{generic}$: This heuristic swaps two randomly picked students who are assigned to two different in-class activities in a randomly picked time block of a day.

• $LLH_{12}^{generic}$: This heuristic applies a hill-climbing greedy local search on the current solution to seek improvement opportunities. For each time block of each day, it visits the students in a fixed order and assigns them the best feasible activity options, i.e., those that reduce the costs the most.

• $LLH_{13}^{tailored}$: This tailor-made heuristic acts on student solutions to reduce the violations of exceeding students' daily course limit. For every student and every day, it removes one randomly picked in-class activity session from the student's plan for a course in which the student has excess activities assigned on that day.

• $LLH_{14}^{tailored}$: This tailor-made heuristic acts on teacher solutions to reduce the violations of exceeding teachers' weekly in-class assignment limit. For every teacher with an overloaded weekly in-class assignment, it removes one randomly picked in-class assignment from the teacher by assigning him/her to the idle state.

• $LLH_{15}^{tailored}$: This tailor-made heuristic acts on teacher solutions to reduce the teacher shortage in the self-study environment. It randomly picks a time block of a day, and assigns a randomly picked idle and available teacher to the self-study state.

• $LLH_{16}^{tailored}$: This tailor-made heuristic acts on student solutions to reduce the number of unmet in-class activity demands of students by increasing the utilization of already planned in-class activity sessions. It randomly picks a time block of a day, and for each planned in-class activity in the picked time, it assigns a randomly picked student who is assigned to self-study but has an unmet demand for the activity to the in-class activity, if this assignment does not require the planning of an additional session of the activity.

• $LLH_{17}^{tailored}$: This heuristic is very similar to the previous one. The difference is that this heuristic also considers students who are already assigned to some in-class activities at the picked time for assignment to the activities that have excess capacities.

• $LLH_{18}^{tailored}$: This heuristic also works towards increasing the utilization of already planned activity sessions. It randomly picks a time and an activity, then it assigns a random number of idle students, who can be feasibly assigned, to the selected activity at the selected time.

• $LLH_{19}^{tailored}$: This tailor-made heuristic acts on both student and teacher solutions to reduce the number of unmet in-class activity demands of students. It randomly picks a time block of a day and a student who is assigned to self-study at the picked time, and randomly assigns the student to an in-class activity for which the student has a demand but is not assigned. Then, it also assigns a randomly picked available teacher, who is qualified to teach the activity and not already assigned to the in-class state, to the in-class state.

• $LLH_{20}^{tailored}$: This tailor-made heuristic acts on both student and teacher solutions to reduce the number of unmet in-class activity demands of students by adding a new in-class activity session. It randomly picks a time block of a day, and plans a new session of an in-class activity by assigning an idle available teacher who is qualified to teach the activity to the in-class state, and a number of students who are assigned to self-study at the picked time but have unmet demands for the activity to the in-class activity. A session of the in-class activity with the largest potential, i.e., the activity that can provide demand satisfaction for the largest group of students, is picked. A number of students are randomly assigned to the picked activity, up to the classroom capacity of the activity.

• $LLH_{21}^{tailored}$: This tailor-made heuristic is very similar to the previous one. The difference is that, in the teacher assignment phase, this heuristic also considers non-idle teachers.

• $LLH_{22}^{tailored}$: This heuristic also searches for opportunities to create new in-class activity sessions. For this purpose, it uses an iteration of the batch heuristic.

5. Computational experiments

This section first introduces the characteristics of the benchmark instances used to test our approaches. The first experiment focuses on the performance of our method against the Gurobi MIP solver. The second experiment presents our benchmark solutions, along with the performance comparisons of the proposed constructive heuristic approaches. The last experiment focuses on the performance of the local search. This section highlights the important results and patterns of the experiments. For ease of reading, detailed outputs of the experiments are given in Tables 1–7 of the online supplement.

In all of the experiments, an Intel Xeon 2.5 GHz processor with 128 GB memory is used. 50 runs are completed in each non-deterministic setting. Each local search run is limited to one hour of running time and to at most 50 000 non-improving iterations. The DTS parameters are tuned offline to the following values: $C_{DTS} = 60$, $\alpha_h = \beta_h = 3$, $\forall h$.

5.1. Dataset

This study is an exploratory work for personalized learning models where students master their learning goals at their own pace, which results in demand-driven learning activity plans in schools. In our research, we collaborate with the Zo.Leer.Ik! (https://www.zoleerik.nl/) secondary schools network, which currently experiments with various personalized learning models in the Netherlands. There are currently 22 schools in this network. According to the experts from VO-raad (Dutch branch organization for secondary education), several school networks in the Netherlands work on implementing personalized learning. Those networks, including Zo.Leer.Ik!, consist of at least 90 schools in total. Due to the fact that the schools in this network have recently started with personalized learning implementations, sufficient data on student demands are not yet available. We therefore generate artificial instances for this problem that reflect the size-related characteristics of the schools in this network (e.g., number of students, number of teachers, number of classrooms, etc.). The demand scenarios that we consider in our instances are based on an expert's opinions from the network. The student demands in the instances are generated by the consideration of four demand spread and two demand level scenarios.

Demand Spread: In traditional educational models, students are grouped into fixed age or level groups. In our instances, this concept is again used to consider different demand clusters over the activity set, although in personalized learning there are no longer fixed groups. However, in our personalized spread scenarios we consider variety in demands within these conceptual groups. The following scenarios are considered for the spread in students' demands.

• T: This is the traditional situation in which the ages determine students' learning paces. This scenario assumes that within an age group there is no variety observed in the students' learning paces. The traditional demand scenario is included in our experiments for the sake of comparing the outcomes of personalized demands with traditional demands.

• P: This is the first personalized demand scenario. It assumes that within an age group, learning demands are spread over activities that range over two learning goals.

• 3P1: This personalized demand scenario assumes that within any age group, three distinct student groups can be observed as a result of their learning speed differences. The three groups represent the slow-, average- and fast-pacing students in each age group. Average-pace groups are assumed to be the largest. The demands of the average speed groups are spread over activities ranging over two learning goals, just like in the first personalized demand scenario. On the other hand, the small groups of slow and fast pacing students demand activities that are spread over only one learning goal.

• 3P2: This scenario is very similar to the previous personalized demand scenario. The only difference between this scenario and the previous one lies in the sizes of the learning speed groups within each age group.

Fig. 3. The illustration of four demand spread scenarios in a course.

Fig. 3 illustrates the differences in these four spread scenarios. This figure shows the learning demands of a representative course in the case of the T, P, 3P1 and 3P2 demand spread scenarios. Normal distributions are used for distributing student demands over lessons for the cases that relate to personalized learning demands in our instances. Our instances are in line with the six-year-long secondary education program in the Netherlands.

Demand Level: In traditional models, students use all of their available school time $|B \times D|$ in a week only for course meetings. This is usually between 30 and 40 hours in a week. In contrast, in personalized learning, students also learn through self-study activities in schools. This would translate into fewer in-class activity demands. The following demand level scenarios are considered, where each student is assumed to be enrolled in six or seven courses.

• 12: Each student demands in total 12 in-class activities per week, two activities for each of his/her six courses.

• 21: Each student demands in total 21 in-class activities per week, three activities for each of his/her seven courses.

Table 4
Weight parameters.

Weight ($w_n$)    Setting 1 ($w^1$)    Setting 2 ($w^2$)
$w_1$             1000                 1000
$w_2$             500                  500
$w_3$             300                  300
$w_4$             50                   50
$w_5$             400                  2000
$w_6$             400                  2000
$w_7$             50                   50
$w_8$             300                  300
$w_9$             20                   20

School Size: Four school size scenarios are considered: small (S), small_200 ($S_{200}$), medium (M) and large (L). The medium size reflects the average school in the collaboration network. Other scenarios are obtained by the rough linearization of the medium size ($S_{200}$ instances do not exactly follow this pattern; explanations for this are given in Section 5.2). In fact, the S and $S_{200}$ scenarios are not realistic; however, they are useful in our experiments to investigate the optimality gaps of our heuristic approach.

• S: 100 students, 12 teachers and 12 classrooms.

• $S_{200}$: 200 students, 20 teachers and ∞ classrooms.

• M: 800 students, 80 teachers and 40 classrooms.

• L: 2400 students, 240 teachers and 120 classrooms.

Weights of Soft Constraints:

This problem involves nine soft constraints. Our concern in this paper is to use realistic weight settings such that the outcomes of our experiments are also meaningful for providing insights to schools, apart from testing and benchmarking our heuristic approach. To achieve that, we benefit from the results of two surveys which we conducted with the participation of the school managers of some personalized learning secondary schools in the Netherlands. In the first survey, which was conducted during a workshop session at the "VO-congress, March 28, 2019", 28 participants were given multiple-choice questions that quantify the relative importance levels of two given soft constraints, while in the second survey seven participants from the Zo.Leer.Ik! network quantified the importance of each soft constraint by directly setting their weights. The results of these surveys almost match with respect to the importance order of soft constraints. Only one significant difference is observed: the importance level of overloading teachers and students, namely the weights $w_5$ and $w_6$ in the results. The second survey suggests considerably higher importance levels for these soft constraints compared to the first survey. As a result, we conduct our experiments by considering two different weight settings, each representing one survey. Table 4 declares these settings; Setting 1 corresponds to the result of the first survey, while Setting 2 corresponds to the second.
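The effect of the two settings can be illustrated with a small sketch. It assumes that the cost of a solution is a weighted sum of one violation measure per soft constraint, which is our simplification of objective (21); the names `WEIGHTS` and `weighted_cost` are hypothetical, and the violation values in the example are made up for illustration only.

```python
# Weights w_1..w_9 of the two settings, as listed in Table 4.
WEIGHTS = {
    "setting1": [1000, 500, 300, 50, 400, 400, 50, 300, 20],
    "setting2": [1000, 500, 300, 50, 2000, 2000, 50, 300, 20],
}


def weighted_cost(violations, setting):
    """Weighted sum of the nine soft-constraint violation measures."""
    weights = WEIGHTS[setting]
    if len(violations) != len(weights):
        raise ValueError("expected one violation measure per soft constraint")
    return sum(w * v for w, v in zip(weights, violations))


# Illustrative solution that only violates the two overload constraints.
example = [0, 0, 0, 0, 1, 2, 0, 0, 0]
print(weighted_cost(example, "setting1"))  # 1200
print(weighted_cost(example, "setting2"))  # 6000
```

Under Setting 2 the same overload violations cost five times as much, so a search guided by that setting would trade more of the other quality measures to avoid overloading.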

We denote the instance with $x_1 \in \{T, P, 3P1, 3P2\}$ demand spread, $x_2 \in \{12, 21\}$ demand level, $x_3 \in \{S, S_{200}, M, L\}$ school size and $x_4 \in \{w^1, w^2\}$ weight setting as "${x_1}^{x_4}_{x_3, x_2}$" in this document. The details of how these instances are generated are described in the online supplement. The description of our data set in the online supplement also explains the data format of our instances, which can be accessed at https://drive.google.com/drive/folders/1OsJ5CxYNK9lPj8-yqGvn0Nz1ADPqs5lv?usp=sharing.
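As a quick check of how many instances this notation yields per school size, the labels can be enumerated as below. The plain-text label format (`spread^setting_{size,level}`) and the names in the snippet are our own rendering of the notation, not a format taken from the data set.

```python
from itertools import product

SPREADS = ["T", "P", "3P1", "3P2"]
LEVELS = [12, 21]
SETTINGS = ["w1", "w2"]


def instance_labels(size, levels=LEVELS):
    """All instance labels of one school size, e.g. 'T^w1_{M,12}'."""
    return [
        f"{spread}^{setting}_{{{size},{level}}}"
        for spread, level, setting in product(SPREADS, levels, SETTINGS)
    ]


print(len(instance_labels("M")))  # 16
```

Four spreads, two demand levels and two weight settings give 16 instances for each of the medium and large school sizes.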

5.2. Benchmarking against a solver

Our heuristic approach is benchmarked against the Gurobi 7.0.2 MIP solver for performance investigation. The small school size instances (S) are used for this investigation, even though they are not realistic cases, because of the limitation of the solver to find optimal or nearly optimal solutions in the considered time limit of four days. In fact, this time limit is too long for practical purposes, because in practice a school needs to solve the problem of the upcoming week during the weekend. We compare the Gurobi solutions with the best solutions obtained from the heuristics. The comparisons are given in Table 1 of the online supplement. Firstly, we observe that Gurobi is not able to find even good solutions for two of the instances: $T^{w^1}_{S,21}$ and $T^{w^2}_{S,21}$. For these instances, the gaps of the Gurobi solutions to the best bounds that Gurobi finds are more than 85% and the heuristic solutions are significantly better than Gurobi's. We argue that the reason for this could be related to the increased number of feasible activity group formations in the traditional demand scenario. This also demonstrates the computational challenge of our problem numerically. For the remaining instances, we found that the heuristic solutions have on average a 15.72% optimality gap, compared to the best lower bounds found by Gurobi.

In order to experiment with larger instances, some simplifications are made. We use another set of instances which has 200 students ($S_{200}$) for this purpose. In these instances, the number of classrooms is assumed to be unlimited for each course, all teachers are first-level and the demand level is only 12. The solver is again given four days of running time. In four out of these eight instances, the heuristic solutions are better in quality than the Gurobi solutions. For these simplified instances, the gaps of the heuristic solutions to the Gurobi bounds are observed to be significantly smaller compared to the instances with 100 students; the average optimality gap of these instances is 9.19%. Also, for the instances with the 3P2 spread scenario, the gaps are even lower than five percent. This could be an indication that the gaps will get smaller as the size of the instances gets larger. Although we are not able to demonstrate this for the larger sizes, such as the M school size instances, due to the current performance of state-of-the-art MIP solvers, we expect that this will be the case. Our intuition is that as school size increases, demand satisfaction will get easier, because the increasing number of students can be grouped in activities and limited teacher and classroom resources could be used more efficiently, given that the activity set is kept fixed. The detailed explanation for this situation is given later in Section 5.3.2.

5.3. Solutions

This section provides the solutions for the medium and large size instances for benchmarking purposes and also an analysis of two different initialization methods. The detailed performance measures of constructive heuristics are given in Tables 2 and 4, while the solutions of our benchmark instances can be found in Tables 3 and 5 of the online supplement.

5.3.1. Initial solutions

Running Times:

In order to see the running time patterns of the batch and decomposition methods across the four demand spread scenarios (T, P, 3P1 and 3P2), in Fig. 4 we present the average running times over all instances of each spread scenario, for each school size. The y-axes of the graphs in this figure give these average CPU values, measured in seconds. The running times of the batch method are considerably lower compared to the decomposition method in almost all instances, with some exceptions in the case of large size instances. This is expected, as the decomposition method utilizes the solver for small-scale subproblems. Although these problems are quickly solved, there are 40 of them in our instances. Moreover, the running times seem to be correlated with the demand spread scenarios in both methods. The instances with high-spread scenarios, 3P1 and 3P2, always require more time in both methods. Note that as spread increases, the demanded set of activities will grow, which will result in an increased number of variables for activity planning ($x_{abd}$), increasing the time required to construct the solutions.

Fig. 4. Average CPUs of batch and decomposition solutions over spread scenarios. M (in blue) and L (in yellow) mark medium and large school size instances, respectively. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 5. The objective values of initial solutions of medium school size instances with demand level 12.

Table 5
Overall performance measures of initial solution heuristics in medium and large school size instances (note: each set has 16 instances).

School size    Batch                             Decomposition
M              Ave: 261,724.19    #Best: 9       Ave: 207,407.24    #Best: 7
L              Ave: 253,139.19    #Best: 12      Ave: 269,622.38    #Best: 4

Solution Qualities:

Table 5 provides overall performance measures of the batch and decomposition methods, with respect to the qualities of the solutions they produce. "Ave" gives the average objective value of the solutions found by the corresponding method for the instances with the corresponding school size scenario, while "#Best" gives the number of times that the corresponding method produces the best-quality solution in the considered instance set. In both medium and large size instances, the initial solutions produced by the batch method are mostly better in quality than those produced by the decomposition method. However, if the averages of the solutions found by these methods are compared, the decomposition method is better than the batch method in medium school size instances.

The performance of these methods with respect to the qualities of the solutions they produce forms patterns over the characteristics of the instances. In order to illustrate these patterns, we present Figs. 5 and 6, which show the objective values found by the two initialization heuristics for the medium school size instances with demand levels 12 and 21, respectively. It can be observed from these figures that the batch method is always better for the low demand level instances, while in the case of the high demand level scenario, the decomposition method is almost always better. We also observe that in the high demand level scenario instances with Setting 2 weights, the decomposition method always outperforms the batch method significantly. The weakness of the decomposition method is that it makes time block based decisions and does not regard the resource availability of the later time blocks when it is making activity assignment decisions for a block. For instance, this method can make undesirable activity group formations for the sake of satisfying student demands in a time block, although there are better formation opportunities in the future time blocks. In fact, this is the likely reason for the worse performance of the decomposition method in the low demand level instances. In these instances, the opportunities to satisfy student demands in a week are not scarce. Lastly, these experiences with our instances indicate that the instance characteristics play an important role in the performance of our heuristics for this problem. Therefore, we cannot conclude that the batch or the decomposition method performs best overall. However, based on our computational