
Aqueous humor proteome of primary open angle glaucoma: A combined dataset of mass spectrometry studies


Contents lists available at ScienceDirect

Data in Brief

journal homepage: www.elsevier.com/locate/dib

Data Article


W.H.G. Hubens a,b,∗, R.J.C. Mohren c, I. Liesenborghs a,d, L.M.T. Eijssen b,e, W.D. Ramdas f, C.A.B. Webers a, T.G.M.F. Gorgels a,∗

a University Eye Clinic Maastricht, Maastricht University Medical Center, Maastricht, the Netherlands
b Department of Mental Health and Neuroscience, Maastricht University, Maastricht, the Netherlands
c Maastricht MultiModal Molecular Imaging (M4I) Institute, Division of Imaging Mass Spectrometry, Maastricht University, Maastricht, the Netherlands
d Maastricht Centre of Systems Biology (MaCSBio), Maastricht University, Maastricht, the Netherlands
e Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
f Department of Ophthalmology, Erasmus Medical Center, Rotterdam, the Netherlands

Article info

Article history: Received 2 June 2020; Revised 4 September 2020; Accepted 15 September 2020; Available online 21 September 2020

Keywords: Primary open angle glaucoma; Aqueous humor; Human; Proteome; Liquid chromatography tandem mass spectrometry

Abstract

Analysis of the proteins of the aqueous humor can help to elucidate the complex pathogenesis of primary open angle glaucoma. Thanks to advances in liquid chromatography tandem mass spectrometry (LC-MS/MS), it is now possible to identify hundreds of proteins in individual aqueous humor samples without the need to pool samples. We performed a systematic literature search to find publications that performed LC-MS/MS on aqueous humor samples of glaucoma patients and of non-glaucomatous controls. Of the seven publications that we found, we obtained the raw data of three. These three studies used glaucoma patients that were clinically similar (i.e. undergoing glaucoma filtration surgery), which prompted us to reanalyse and combine their data. Raw data of each study were analysed separately with the latest version of MaxQuant (version 1.6.11.0). Outcome files were exported to Microsoft Excel. Samples belonging to the same patient were averaged to obtain peptide expression values per individual. We compared the overlap of identified proteins using the VLOOKUP function of Excel and publicly available Venn diagram software. For the peptide sequences that can belong to multiple proteins (usually of the same protein family), we initially included all possibly identified proteins. This ensured that we would not miss a potential overlap between the studies due to differences in identified peptide counts. Next, for those peptides for which we had compared multiple proteins, only one unique protein was included in our analysis, i.e. either the protein overlapping between studies or, in case of no overlap, the protein that had the highest identified peptide count. This yielded 639 unique proteins detected in aqueous humor of either glaucoma patients or non-glaucomatous controls. In our manuscript entitled “The aqueous humor proteome of primary open angle glaucoma: An extensive review” [1], we further analysed this dataset. The dataset was exported to Perseus (version 1.6.5.0). We removed contaminants and filtered for proteins detected with high confidence, i.e. in more than 70% of the samples of at least one study. This yielded 248 proteins, of which we compared the expression in glaucoma patients against control patients. Gene ontology enrichment analysis and pathway analysis were used to interpret the results. The unfiltered dataset reported in this data article and the approach reported here to reanalyse and combine raw data of different studies can be applied by other glaucoma researchers to gain more insight into the pathogenesis of glaucoma.

© 2020 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)

DOI of original article: 10.1016/j.exer.2020.108077

∗ Corresponding authors. E-mail addresses: w.hubens@maastrichtuniversity.nl (W.H.G. Hubens), theo.gorgels@mumc.nl (T.G.M.F. Gorgels).

https://doi.org/10.1016/j.dib.2020.106327

2352-3409

Specifications Table

Subject: Ophthalmology

Specific subject area: Aqueous humor proteome of primary open angle glaucoma

Type of data: Table

How data were acquired: Raw data were obtained from ProteomeXchange, a publicly available database, and reanalysed with the freely available MaxQuant software (Max Planck Institute, version 1.6.11.0). During our study, dataset PXD004928 was not yet publicly available and we obtained the raw data after contacting the authors. Microsoft Excel was used to combine the files. Subsequently, we imported the combined dataset into Perseus (Max Planck Institute, version 1.6.5.0) for filtering and statistical analysis.

Data format: Raw; Analysed; Filtered

Parameters for data collection: We performed a systematic literature search to find studies that investigate the proteome of aqueous humor from patients with glaucoma compared to non-glaucomatous controls. We considered only studies that included glaucoma patients without other ocular comorbidities. This meant that from the 9 proteomic studies we found, 7 were eligible to obtain the raw data. We managed to obtain the raw data of three publications. They used similar glaucoma patients, i.e. patients undergoing glaucoma filtration surgery, which prompted us to reanalyse their raw data and combine the outcome for new statistical analysis.

Description of data collection: We reanalysed the raw data of three publications that investigated the aqueous humor proteome of primary open angle glaucoma patients compared to non-glaucomatous controls, using LC-MS/MS. We downloaded the raw data from the depositories and subsequently loaded them into the MaxQuant software program (v1.6.11.0) for analysis. Analysed data were exported to Microsoft Excel to average duplicates and to combine the different studies into 1 protein database. This database was imported into Perseus analysis software (v1.6.5.0) to filter for proteins with high detection confidence and for subsequent statistical analysis to compare glaucoma patients with controls.

Data source location: University Eye Clinic Maastricht, Maastricht, Netherlands

Data accessibility: Raw data were obtained from ProteomeXchange:
Dataset 1: “Human aqueous humor of Primary open angle glaucoma LC-MS/MS”; PXD007624; https://www.ebi.ac.uk/pride/archive/projects/PXD007624
Dataset 2: “Comparative shotgun proteomics of aqueous humor for cataract, glaucoma and pseudoexfoliation eye disorders”; PXD002623; https://www.ebi.ac.uk/pride/archive/projects/PXD002623
Dataset 3: “Comparative evaluation of the aqueous humor proteome of primary angle closure and primary open angle glaucomas and senile cataract eyes”; PXD004928; https://www.ebi.ac.uk/pride/archive/projects/PXD004928
Analysed data are included in this article.

Related research article: WHG Hubens, RJC Mohren, I Liesenborghs, LMT Eijssen, WD Ramdas, CAB Webers, TGMF Gorgels, The aqueous humor proteome of primary open angle glaucoma: an extensive review, Exp. Eye Res. 197 (2020) 108077, doi: 10.1016/j.exer.2020.108077

Value of the Data

• This dataset provides the list of proteins present in the aqueous humor of primary open angle glaucoma patients and cataract patients and facilitates extraction and quantification of disease-specific differences.

• This dataset is a rich resource for glaucoma researchers and pharmaceutical companies interested in unravelling the proteome of primary open angle glaucoma.

• The dataset facilitates pathway analysis to identify new glaucoma pathways that can be targeted in human or animal studies, with the aim of establishing new biomarkers or new interventions for primary open angle glaucoma.

• The approach detailed here to regroup, combine and reanalyse publicly available data may be useful for other studies on data in public databases.

1. Data Description

Fig. 1:

Fig. 1 is a flowchart that visualizes the workflow that we followed in our review on the aqueous humor proteome of primary open angle glaucoma patients. In short, a literature search was performed to find eligible studies. We subsequently tried to obtain the raw data related to these studies either via publicly available repositories or by attempting to contact the corresponding author. Three datasets were obtained (see data accessibility table for the respective links). Each dataset was reanalysed and processed, after which they were combined into 1 dataset for statistical analysis.

Fig. 1. Flowchart reporting the workflow to obtain a combined proteomic dataset of glaucomatous aqueous humor.

File 1:

File 1 is a description of the patient characteristics. The columns are self-explanatory. Humphrey visual field analyser test results (columns J and K) were not available for some patients, as indicated by “NA”. The samples highlighted in red were excluded from our combined analysis, because these patients were additionally diagnosed with pseudoexfoliation syndrome (PEX). The remaining controls and glaucoma patients were pooled to form a combined dataset, of which the average age, gender distribution and average eye pressure are provided in columns Q-S. Statistical analysis (column T) showed that these parameters were not significantly different between the two groups.

File 2 (general):

We reanalysed the three datasets with MaxQuant and exported the output files to Microsoft Excel (file 2). The data are named after the corresponding first authors. These files are considered as raw data, i.e. they are unprocessed and contain several redundant columns. The general layout is as follows: possible identified proteins (A), protein with most peptide reads (B), how many times a peptide was measured (C-E), the protein names (F), gene symbol (G), fasta header (H), peptide read per sample, molecular weight of the protein, peptide identification method, sequence coverage, uncorrected intensity, iBAQ-corrected intensity, LFQ-corrected intensity and MS/MS count.

For each file, the samples were annotated differently by the authors. An overview is given below:

File 2 dataset 1: Adav.

This dataset contains 5 control patients (CG065, CG070, CG072, CG075 and CG078) and 5 glaucoma patients (G009, G010, G016, G039 and G041). Aqueous humor of each patient was analysed in duplicate (_A and _B).

File 2 dataset 2: Kaur.

This dataset contains 9 control patients and 9 glaucoma patients. The control group was denoted as Cat and the glaucoma group as POAG. It seems this study was performed in two batches. The first batch of 4 control and 4 POAG patients was annotated as “long” (Cat 1 UR long, Cat 2 long, Cat 3 long, Cat 4 long, POAG 1 long, POAG 2 long, POAG 3 long and POAG 4 long) and was performed in duplicate (long1 vs long2). One sample was also analysed a third time (cat 1 UR), presumably to test a different protocol. The second batch of 5 control (New Cat 1–5) and 5 POAG (New POAG 1–5) patients was not measured in duplicate.

File 2 dataset 3: Kliuchnikova.

This dataset contains 11 control patients (k10, k14, k18, k24, k32, k44, k52, k60, k62, k64, k8) and 7 glaucoma patients (g110, g114, g116, g12, g50, g54, g56). All patients were analysed in triplicate (_1, _2 and _3).
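The replicate suffixes listed above (_A/_B and _1 to _3) make it possible to collapse replicate runs into one value per patient. The Python sketch below is only an illustration of the averaging rule stated in Section 2 (average the replicates that have a value, keep a single measured value as is, use 0 when nothing was measured). It is not the authors' Excel workflow. The function names are hypothetical, and the code assumes that a missing measurement is encoded as 0 and that sample names end in one of the suffixes above; the “long1/long2” naming of the Kaur dataset would need its own rule.

```python
import re
from collections import defaultdict

# Assumed replicate suffixes: "_A"/"_B" (dataset 1) or "_1".."_3" (dataset 3).
REPLICATE_SUFFIX = re.compile(r"_[AB123]$")


def patient_id(sample_name):
    """Strip the replicate suffix, e.g. 'G009_A' -> 'G009', 'k10_3' -> 'k10'."""
    return REPLICATE_SUFFIX.sub("", sample_name)


def average_replicates(intensities):
    """Collapse replicate runs into one value per patient for a single protein.

    `intensities` maps sample name -> LFQ intensity, where 0 (or None)
    means the protein was not quantified in that run.
    """
    per_patient = defaultdict(list)
    for sample, value in intensities.items():
        per_patient[patient_id(sample)].append(value)

    averaged = {}
    for patient, values in per_patient.items():
        measured = [v for v in values if v]  # drop 0 / None (not quantified)
        # Mean of the measured replicates; a single measured value is kept
        # unchanged; no measured value at all gives 0.
        averaged[patient] = sum(measured) / len(measured) if measured else 0.0
    return averaged
```

Averaging only the measured replicates, rather than all replicates including zeros, keeps a protein that was missed in one run from having its intensity halved.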

Processed datasets (file 3 and file 4):

Protein expressions of duplicate samples were averaged. The averaged intensity, iBAQ intensity and LFQ intensity for each dataset are provided in file 3. This file contains three tabs named “Adav_duplo removed”, “Kaur_duplo removed” and “Kliu_duplo removed”. Layout and sample coding are the same as for file 2. Using the VLOOKUP function of Microsoft Excel and Venn diagram software, all reported proteins across studies were matched into a single file (file 4). We present the proteins (A), majority protein UniProt ID (B), protein name (C), gene name (D), fasta header (E), in how many samples the protein is identified within each group and study (G-L) and the average LFQ expression in each study (N-P), and show that after normalization the average LFQ intensity was the same in each study (columns S-U). The normalized LFQ intensity per sample/study is reported (columns W-BP) and the raw LFQ intensities are presented in columns BR-DK. Raw intensities (DR-FK) and iBAQ normalized intensities (FP-HI) are also provided.
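The cross-study matching described above was done with VLOOKUP and Venn diagram software. As a rough equivalent, the Python sketch below pairs protein groups from two studies whenever they share at least one accession. It assumes that each protein group is a semicolon-separated string of IDs, as in a MaxQuant “Majority protein IDs” column; the function name and the example accessions are illustrative only.

```python
def match_protein_groups(study_a, study_b):
    """Pair protein groups from two studies that share at least one ID.

    Each study is a list of semicolon-separated ID strings. Returns a dict
    mapping every group of study_a to the first matching group of study_b,
    or to None when no ID overlaps.
    """
    # Index every individual ID of study_b by the group it belongs to.
    index_b = {}
    for group in study_b:
        for pid in group.split(";"):
            index_b.setdefault(pid.strip(), group)

    matches = {}
    for group in study_a:
        hit = None
        # Try each candidate ID in turn so that one shared ID is enough.
        for pid in group.split(";"):
            hit = index_b.get(pid.strip())
            if hit is not None:
                break
        matches[group] = hit
    return matches


# Illustrative accession strings, not taken from the dataset.
study_a = ["P0C0L4;P0C0L5", "P01024", "P99999"]
study_b = ["P0C0L5", "P01024;P02768"]
matches = match_protein_groups(study_a, study_b)
```

Checking every candidate ID, instead of only the first one in each group, is what prevents a missed overlap when two studies rank the members of a protein family differently.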

Filtered dataset (file 5):

For the purpose of our review [1], the dataset was further analysed in Perseus. We removed contaminants, filtered on proteins whose LFQ protein expression was detected in more than 70% of the samples in at least one study, log-normalized the LFQ intensities and performed multiple ANOVA to compare glaucoma and control patients. The outcome was again exported to Microsoft Excel (file 5). The filtered data file consists of the following columns: gene name (A), majority protein UniProt ID (B), protein name (C), mean expression in controls (D), in how many control samples the protein was detected (E), mean expression in glaucoma (F), in how many glaucoma samples the protein was detected (G), and difference in log-transformed protein expression between glaucoma and controls (H). The uncorrected p-value (I) and the FDR-corrected q-value (J) are reported. Columns L-BE are protein expression values of each individual sample.
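The detection filter and log transformation were carried out in Perseus. The Python sketch below only illustrates the logic and rests on several assumptions: a protein counts as detected when its intensity is above 0, the threshold is applied as “at least 70% of the controls or of the glaucoma patients within at least one study” (the wording used in Section 2), and the logarithm is base 2. All function names are hypothetical.

```python
import math


def detection_fraction(values):
    """Fraction of samples with a non-zero (i.e. detected) intensity."""
    return sum(1 for v in values if v > 0) / len(values) if values else 0.0


def passes_filter(protein, threshold=0.7):
    """True if the protein is detected in >= threshold of the control OR the
    glaucoma samples of at least one study.

    `protein` maps study name -> {"control": [...], "glaucoma": [...]}.
    """
    return any(
        detection_fraction(groups[group]) >= threshold
        for groups in protein.values()
        for group in ("control", "glaucoma")
    )


def log2_transform(values):
    """log2 of detected intensities; undetected (0) values become None."""
    return [math.log2(v) if v > 0 else None for v in values]
```

Undetected values are returned as None rather than as a number, because the logarithm of 0 is undefined and a placeholder value would distort later group means.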

2. Experimental Design, Materials, and Methods

As depicted in the flowchart (Fig. 1), we performed a systematic literature search to find studies that reported proteomics data from LC-MS/MS studies of glaucoma aqueous humor samples. Keywords used were “primary open angle glaucoma” and “aqueous humor”. We found 9 LC-MS/MS studies, of which 7 matched our criterion that other ocular diseases are absent [2–8]. We attempted to get access to the underlying raw data either via depositories or by contacting the corresponding authors. We managed to obtain the raw data of three publications [2–4] (PXD007624, PXD002623 and PXD004928).

For two of these publications, the patient characteristics were unfortunately not well defined. Upon contacting the corresponding authors, they kindly provided us with the missing information. We report the detailed patient characteristics in this manuscript (file 1). Since the inclusion and exclusion criteria were largely overlapping between the three studies, we decided to pool the controls and to pool the glaucoma patients for a combined analysis. Patients additionally diagnosed with pseudoexfoliation syndrome were excluded from this combined dataset. As seen from columns Q-T, the pooled groups of 25 controls and 21 glaucoma patients were comparable in terms of age, gender distribution and eye pressure.

The raw data of primary open angle glaucoma patients and controls were reanalyzed using MaxQuant software (Max Planck Institute; [9,10]). As the raw data varied greatly between the studies, we failed to normalize the data in a pooled reanalysis. Therefore, we decided to reanalyze each study separately. The following settings were used:

• Variable modification: Oxidation (M) and Acetylation (protein N-term)
• Fixed modification: Carbamidomethyl (C)
• Trypsin digestion
  ◦ Max missed cleavage: 2
• Label free quantification
  ◦ Minimum ratio count: 2
  ◦ Fast LFQ mode enabled
  ◦ Stabilize large LFQ ratios
  ◦ Min number of neighbours: 3; average number of neighbours: 6
• Peptide identification:
  ◦ “from and to”
  ◦ Advanced identification enabled
    ▪ Second peptides
    ▪ Match between runs

Output files were exported to Microsoft Excel (file 2). Sample or run duplicates were combined to obtain protein expression values per individual (file 3). We did this according to the data processing recommendations of Bijlsma et al. [11]. This meant that samples were averaged if more than one sample had LFQ expression values. If only one of the duplicate samples had measured expression values, this sample was considered as the average. For proteins of which none of the replicates had expression values, the value was set to 0.

To combine the datasets, we extracted the list of majority protein IDs from each study. In case of multiple majority protein IDs matching to a peptide sequence, we separated them into different columns. This enabled us to check if at least one of the suggested proteins was reported in the other studies, ensuring the highest amount of overlap between the studies. We identified the overlap via two different methods, i.e. the VLOOKUP function of Microsoft Excel and free Venn diagram software (VIB-Ugent; http://bioinformatics.psb.ugent.be/webtools/Venn/). After we established which proteins had overlapping detection between studies, we used the VLOOKUP function to copy the corresponding expression values of each study, creating our final combined dataset (file 4).

For combined analysis in the publication corresponding to this dataset [1], we used the LFQ intensities of the proteins. LFQ intensities varied greatly between studies (1000-fold difference) and needed normalization. This was achieved by dividing the LFQ intensity of a protein by the average LFQ intensity in the respective study and then multiplying by the average LFQ intensity across all studies. Researchers can apply other normalization methods on this dataset for intensity, iBAQ intensity and LFQ intensity. File 4 was subsequently imported into free analysis software (Perseus 1.6.5.0; Max Planck Institute) [12]. Here we filtered for proteins that were not considered contaminants and were detected with high confidence. This meant that within a study, proteins were detected in at least 70% of either the control patients or the glaucoma patients. Next, we performed a log-transformation on the normalized LFQ protein expression intensity data and statistically compared the expression of the glaucoma group and the control group using the built-in multiple comparison ANOVA with FDR-adjusted correction. The outcome was exported back to Microsoft Excel (file 5).
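The normalization step described above (divide by the study average, multiply by the average across studies) can be written out as a short Python sketch. Two points are assumptions, because the text does not specify them: the “average across all studies” is taken here as the mean of the per-study averages, and whether zero (undetected) values enter the averages is left to whatever list the caller passes in. The function names are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def normalize_lfq(studies):
    """Rescale LFQ intensities so that every study has the same average.

    `studies` maps study name -> list of LFQ intensities (all proteins and
    samples of that study, flattened). Each value is divided by its study
    mean and multiplied by the mean of the study means.
    """
    study_means = {name: mean(vals) for name, vals in studies.items()}
    overall = mean(list(study_means.values()))
    return {
        name: [v / study_means[name] * overall for v in vals]
        for name, vals in studies.items()
    }


# Two toy studies whose intensities differ 1000-fold.
normalized = normalize_lfq({"a": [1.0, 3.0], "b": [1000.0, 3000.0]})
```

After this rescaling both toy studies have the same mean, which mirrors the check reported for file 4 that the average LFQ intensity was the same in each study after normalization. Relative differences between proteins within a study are unchanged.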

3. Ethics Statement

The current study used data from three previously published datasets on human aqueous humor proteome and we did not have contact with any of the study participants. All studies declared that they adhered to the Declaration of Helsinki and performed the studies on participants that provided written informed consent.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.

Acknowledgements

The authors thank prof. dr. Ramanjit Sihota, dr. Inderjeet Kaur, prof. dr. Sergei Moshkovskii, dr. Anna Kliuchnikova and all collaborators of the included datasets, for making their data publicly available and providing additional information regarding their studies upon our request.

Supplementary Materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.dib.2020.106327.

References

[1] WHG Hubens, RJC Mohren, I Liesenborghs, LMT Eijssen, WD Ramdas, CAB Webers, TGMF Gorgels, The aqueous humor proteome of primary open angle glaucoma: an extensive review, Exp. Eye Res. 197 (2020) 108077, https://doi.org/10.1016/j.exer.2020.108077.

[2] SS Adav, J Wei, Y Terence, BC Ang, LW Yip, SK Sze, Proteomic analysis of aqueous humor from primary open angle glaucoma patients on drug treatment revealed altered complement activation cascade, J. Proteome Res. 17 (7) (2018) 2499–2510.

[3] I Kaur, J Kaur, K Sooraj, S Goswami, R Saxena, VS Chauhan, R Sihota, Comparative evaluation of the aqueous humor proteome of primary angle closure and primary open angle glaucomas and age-related cataract eyes, Int. Ophthalmol. (2018) 1–36.

[4] AA Kliuchnikova, NI Samokhina, IY Ilina, DS Karpov, MA Pyatnitskiy, KG Kuznetsova, IY Toropygin, SA Kochergin, IB Alekseev, VG Zgoda, et al., Human aqueous humor proteome in cataract, glaucoma, and pseudoexfoliation syndrome, Proteomics 16 (13) (2016) 1938–1946.

[5] D Salamanca, JL Gomez-Chaparro, A Hidalgo, F Labella, Differential expression of proteome in aqueous humor in patients with and without glaucoma, Arch. Soc. Esp. Oftalmol. 93 (4) (2018) 160–168.

[6] Y Ji, X Rong, H Ye, K Zhang, Y Lu, Proteomic analysis of aqueous humor proteins associated with cataract development, Clin. Biochem. 48 (18) (2015) 1304–1309.

[7] MA Kaeslin, HE Killer, CA Fuhrer, N Zeleny, AR Huber, A Neutzner, Changes to the aqueous humor proteome during glaucoma, PLoS One 11 (10) (2016) e0165314.

[8] S Sharma, KE Bollinger, SK Kodeboyina, W Zhi, J Patton, S Bai, B Edwards, L Ulrich, D Bogorad, A Sharma, Proteomic alterations in aqueous humor from patients with primary open angle glaucoma, Invest. Ophthalmol. Vis. Sci. 59 (6) (2018) 2635–2643.

[9] J Cox, M Mann, MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification, Nat. Biotechnol. 26 (2008) 1367–1372.

[10] S Tyanova, T Temu, J Cox, The MaxQuant computational platform for mass spectrometry-based shotgun proteomics, Nat. Protocols 11 (2016) 2301–2319.

[11] S Bijlsma, I Bobeldijk, ER Verheij, R Ramaker, S Kochhar, IA Macdonald, B van Ommen, AK Smilde, Large-scale human metabolomics studies: a strategy for data (pre-) processing and validation, Anal. Chem. 78 (2006) 567–574.

[12] S Tyanova, T Temu, P Sinitcyn, A Carlson, MY Hein, T Geiger, M Mann, J Cox, The Perseus computational platform
