This database is described in
Please cite this publication when referencing this material, and also include the standard citation for PhysioNet:
The Long-Term ST Database contains 86 lengthy ECG recordings of 80 human subjects, chosen to exhibit a variety of events of ST segment changes, including ischemic ST episodes, axis-related non-ischemic ST episodes, episodes of slow ST level drift, and episodes containing mixtures of these phenomena. The database was created to support development and evaluation of algorithms capable of accurate differentiation of ischemic and non-ischemic ST events, as well as basic research into mechanisms and dynamics of myocardial ischemia.
Half (43) of these 86 recordings, representing 42 of the 80 subjects, were contributed to PhysioNet by the creators of the database in February 2003, and the remaining half of the database was contributed in May 2007. (A corrected version of s30801.dat was also posted together with the second half of the database.) Detailed clinical notes and ST deviation trend plots are provided for all 86 records. The entire Long-Term ST Database is also available from its original home page at the Laboratory for Biomedical Computer Systems and Imaging at the University of Ljubljana, Slovenia.
The individual recordings of the Long-Term ST Database are between 21 and 24 hours in duration, and contain two or three ECG signals. Each ECG signal has been digitized at 250 samples per second with 12-bit resolution over a range of ±10 millivolts. Each record includes a set of meticulously verified ST episode and signal quality annotations, together with additional beat-by-beat QRS annotations and ST level measurements.
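As a rough check on these figures, the nominal quantization step and the number of samples per signal can be worked out directly. This is only a sketch: it assumes that the 12-bit range maps linearly onto the full ±10 mV span, and the constant and function names are illustrative. The gain recorded in each record's header file is authoritative and may differ from this nominal value.

```python
# Nominal figures derived from the description above. All names here are
# illustrative assumptions, not part of the database or its software.
SAMPLING_RATE_HZ = 250
ADC_BITS = 12
RANGE_MV = 20.0  # span from -10 mV to +10 mV


def nominal_step_uv():
    """Quantization step in microvolts, assuming a linear full-range mapping."""
    return RANGE_MV * 1000.0 / (2 ** ADC_BITS)


def samples_per_signal(hours):
    """Number of samples in one ECG signal of the given duration."""
    return int(hours * 3600 * SAMPLING_RATE_HZ)


print(nominal_step_uv())       # 4.8828125 (microvolts per ADC step)
print(samples_per_signal(24))  # 21600000
```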
For each recording, the first digit in the record name (2 or 3) indicates the number of ECG signals. Records obtained from the same subject have names that differ in the last digit only.
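The naming rule can be sketched in a few lines of Python. The helper names below are hypothetical, and the sketch assumes that every record name starts with the letter "s" followed by digits, as in the names s20021 and s30801 mentioned on this page.

```python
from collections import defaultdict


def n_signals(record_name):
    """Number of ECG signals, read from the first digit of the record name."""
    digit = record_name[1]  # assumes names look like "s20021"
    if digit not in ("2", "3"):
        raise ValueError("unexpected record name: " + record_name)
    return int(digit)


def subject_key(record_name):
    """Records from the same subject differ in the last digit only."""
    return record_name[:-1]


def group_by_subject(record_names):
    """Map each subject key to the list of its record names."""
    groups = defaultdict(list)
    for name in record_names:
        groups[subject_key(name)].append(name)
    return dict(groups)


print(n_signals("s20021"))    # 2
print(n_signals("s30801"))    # 3
print(subject_key("s20021"))  # s2002
```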
Each record is represented by 12 files, all with the same base name (the record name) and a suffix that identifies the file type:
- a (text) .hea (header) file, containing detailed clinical information for the subject;
- a (binary) .dat (signal) file, containing the digitized ECG signals;
- several (binary) annotation files, identifiable by suffix:
  - .ari (automatically-generated beat annotations)
  - .atr (manually corrected beat annotations)
  - .16a (automatically-generated, manually-corrected ST-segment measurements, based on 16-second moving averages, updated for each beat)
  - .sta (ST-segment episode annotations, Vmin = 75 µV, Tmin = 30 s; see below)
  - .stb (ST-segment episode annotations, Vmin = 100 µV, Tmin = 30 s)
  - .stc (ST-segment episode annotations, Vmin = 100 µV, Tmin = 60 s)
- a (text) .cnt file, containing counts of ST episodes in the .sta, .stb, and .stc files;
- a (text) .stf file, containing the ST level function, the linearly approximated baseline ST level function (ST level reference function), and the ST deviation function for each ECG lead;
- a (binary) .tsr.zip file, containing additional data files needed by SEMIA (see below):
  - a (text) _fin.dmy file, containing fine diagnostic and morphology time series
  - a (text) _raw.dmy file, containing raw diagnostic and morphology time series
  - a (text) _1.sta file, containing ST segment markers
  (Unpack the .tsr.zip file for the record(s) of interest into your current directory.)
- a (binary) .klt.zip file, which decompresses to a text file containing:
  - time series of ST segment Karhunen-Loève transform coefficients (ST segment principal components)
  - time series of QRS complex Karhunen-Loève transform coefficients (QRS complex principal components)
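As an illustration of the naming scheme, the twelve file names for a record can be generated from its base name. This is a minimal sketch: the function name is hypothetical, and it only builds names from the suffixes listed above without checking that the files exist.

```python
# Suffixes of the 12 files that make up one record, as listed above.
SUFFIXES = (
    ".hea", ".dat",                  # header and signal files
    ".ari", ".atr", ".16a",          # beat and ST measurement annotations
    ".sta", ".stb", ".stc",          # ST episode annotations
    ".cnt", ".stf",                  # episode counts and ST functions
    ".tsr.zip", ".klt.zip",          # SEMIA data and KL coefficient series
)


def record_files(record_name):
    """Return the expected file names for one record (illustrative helper)."""
    return [record_name + suffix for suffix in SUFFIXES]


for name in record_files("s20021"):
    print(name)
```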
The measurements in the .16a files were used to construct ST level and
deviation functions for each signal, as recorded in the .stf files.
(Further details about the .klt.zip files are available here.) ST
episodes were identified independently for each signal, based on its ST
deviation function and on these criteria:
- An episode begins when the magnitude of the ST deviation function first exceeds 50 µV;
- The deviation must reach a magnitude of Vmin or more throughout a continuous interval of at least Tmin;
- The episode ends when the deviation becomes smaller than 50 µV, provided that it does not exceed 50 µV in the following 30 seconds.
Since differing criteria may be appropriate depending on the application,
three sets of ST episode annotations are provided. The annotation codes
used in the
files are described here.
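The episode criteria above can be sketched in Python as follows. This is not the procedure used by the database authors, only an illustration under stated assumptions: the ST deviation function is given as a uniformly sampled list in microvolts with spacing dt_s seconds, a run of k consecutive samples is treated as lasting k × dt_s seconds, and an episode still in progress at the end of the series is closed there. The function, parameter, and dictionary names are hypothetical; the Vmin and Tmin values are those given for the .sta, .stb, and .stc files.

```python
# (Vmin in microvolts, Tmin in seconds) for the three annotation sets.
CRITERIA = {"sta": (75.0, 30.0), "stb": (100.0, 30.0), "stc": (100.0, 60.0)}


def find_st_episodes(dev_uv, dt_s, v_min_uv, t_min_s,
                     v_edge_uv=50.0, t_guard_s=30.0):
    """Return (start, end) sample index pairs; end is exclusive."""
    n = len(dev_uv)
    guard = max(1, round(t_guard_s / dt_s))  # samples in the 30 s guard window
    need = max(1, round(t_min_s / dt_s))     # samples required at >= Vmin
    above = [abs(v) > v_edge_uv for v in dev_uv]
    episodes = []
    i = 0
    while i < n:
        if not above[i]:
            i += 1
            continue
        start = i  # magnitude first exceeds 50 uV
        j = i
        # End at the first sample at or below 50 uV that is followed by a
        # guard window in which the magnitude never exceeds 50 uV.
        while j < n:
            if not above[j] and not any(above[j + 1:j + 1 + guard]):
                break
            j += 1
        end = j
        # Longest continuous run with magnitude >= Vmin inside the candidate.
        run = best = 0
        for v in dev_uv[start:end]:
            run = run + 1 if abs(v) >= v_min_uv else 0
            best = max(best, run)
        if best >= need:
            episodes.append((start, end))
        i = end
    return episodes


# Synthetic example: 40 s at 80 uV, sampled once per second.
series = [0.0] * 10 + [80.0] * 40 + [0.0] * 100
for suffix, (v_min, t_min) in CRITERIA.items():
    print(suffix, find_st_episodes(series, 1.0, v_min, t_min))
```

With this synthetic series, only the least strict criteria (Vmin = 75 µV, Tmin = 30 s) yield an episode, which illustrates why the three annotation sets can contain different numbers of episodes for the same record.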
For each record, the numbers of ST episodes as determined by each of the three
sets of criteria are summarized in an additional text file (with suffix
.cnt). The deviation functions and the locations of the episodes are
presented graphically in a set of trend plots here. Each
record is represented by a 24-hour plot (_00-24.png) and by five
6-hour plots which overlap by one hour.
Development of the Long-Term ST Database was an inter-institutional and international effort coordinated by Prof. Franc Jager of the Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia. Other investigators include: Roman Dorn, PhD, and Ales Smrdel, MSc, of the Faculty of Computer and Information Science, Ljubljana; Dr. Gorazd Antolic of the University Medical Center, Ljubljana, Slovenia; Drs. Alessandro Taddei and Michele Emdin of the CNR Institute for Clinical Physiology (the creators of the European ST-T Database), Pisa, and Prof. Carlo Marchesi of the University of Firenze, Firenze, Italy; and Dr. Roger Mark and George Moody of the Massachusetts Institute of Technology (the creators of the MIT-BIH Arrhythmia Database), Cambridge, MA, USA, and the Beth Israel Deaconess Medical Center, Boston, MA, USA. The project was supported by Medtronic, Inc. (Minneapolis, MN, USA) and Zymed, Inc. (Camarillo, CA, USA). Development of the Long-Term ST Database began in 1995 and was completed in 2002. We thank all who contributed to this project; further details are here.
Several sources contributed recordings to the Long-Term ST Database:
- Eleven of the recordings included in the Long-Term ST Database are from the initial Long-Term ST Database developed under a joint U.S.-Slovenian research project between 1995 and 1998.
- Ten additional recordings of the Long-Term ST Database are from the collection originally gathered by the Pisa group for the European ST-T Database, which contains two-hour excerpts of some of these same recordings. The original analog recordings were redigitized for the Long-Term ST Database; since the signals have been rescaled as a result, direct comparison of the annotations in the European ST-T Database records with those for the corresponding portions of the Long-Term ST Database records is not possible. The inclusion of these recordings in the Long-Term ST Database allows study of the dynamics of ischemic ST changes over a much longer period in these previously well-studied subjects. Among the samples available here, record s20021 includes the two-hour segment that was previously digitized to produce record e0113 of the European ST-T Database.
- Another 18 of the Long-Term ST Database recordings, those with three ECG signals, were contributed to the project by Zymed, Inc.
The annotation of the Long-Term ST Database was performed using SEMIA, a program written by the group in Ljubljana for this purpose. SEMIA provides an interactive graphical user interface to a semi-automated algorithm for measurement of ST levels. Sources for SEMIA, and a precompiled version for GNU/Linux, are available here (as individual files), and as a gzip-compressed tar archive.
Each recording was reviewed independently by expert annotators using SEMIA at each of the three sites (Ljubljana, Pisa, and Cambridge). Participants met several times annually to obtain the consensus reference annotations.
A series of SEMIA screenshots illustrates the annotation process.
The first task faced by the expert annotators was to mark the locations of the PQ junction (the isoelectric level) and the J point, based on 16-second averaged cardiac cycles chosen at frequent intervals throughout the recordings. These marks serve as guideposts for the automated ST level measurement algorithm that performs the next step. The experts then examine the time series of ST level measurements in order to locate and to mark a set of local reference points (marked as LR in the upper panel of the figure). These are used to construct a piecewise linear baseline ST level function, which may vary over time as a result of body position changes or other factors unrelated to ischemia, especially in subjects with prior myocardial infarctions. Axis shifts reflect body position changes, and are usually most apparent in the QRS complexes (note the changes in the QRS principal components, KL1 - KL5, in the lower panel of the figure). By contrast, when ischemic ST changes occur, they are most apparent in the principal components of the ST segment (see the lower panel in this screenshot). Local references are placed before and after each such episode, and the episodes are annotated next. During this process, the expert annotators have the option of viewing either the ST level time series or the ST deviation time series (formed by subtracting the baseline ST level function from the uncorrected ST level time series), as shown in the upper panels of the two screenshots. For further details, see reference 4 below.
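The relationship between the local reference points, the baseline ST level function, and the ST deviation time series can be sketched as follows. This is an illustration, not SEMIA's implementation: the function names are hypothetical, the reference points are assumed to be given as (time in seconds, ST level in µV) pairs sorted by time with distinct times, and holding the nearest reference value outside the covered interval is an assumption made here for simplicity.

```python
from bisect import bisect_right


def baseline_at(t, refs):
    """Piecewise linear baseline ST level through the local reference points."""
    times = [r[0] for r in refs]
    if t <= times[0]:
        return refs[0][1]   # assumption: hold the first value before it
    if t >= times[-1]:
        return refs[-1][1]  # assumption: hold the last value after it
    k = bisect_right(times, t)  # times[k - 1] <= t < times[k]
    (t0, v0), (t1, v1) = refs[k - 1], refs[k]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def st_deviation(times, levels, refs):
    """ST deviation = uncorrected ST level minus the baseline ST level."""
    return [lv - baseline_at(t, refs) for t, lv in zip(times, levels)]


refs = [(0, 0.0), (100, 50.0)]  # two local reference points
print(baseline_at(50, refs))                             # 25.0
print(st_deviation([0, 50, 100], [10, 125, 50], refs))   # [10.0, 100.0, 0.0]
```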
Software for producing printed documentation of the Long-Term ST Database is available for Linux or Unix. The software produces compact trend plots of the ST level and ST deviation time series, with indicators of ischemic and non-ischemic ST episodes.
Franc Jager and Miha Amon have contributed additional sets of time series computed from the ST segments of each normal and non-noisy beat in the database. In each case, they provided time series computed separately for each ECG lead.
- In 2009, Miha and Franc calculated coefficients of normalized and non-normalized Legendre orthonormal basis functions. The Legendre orthonormal-transform coefficient time series are in the legendre subdirectory.
- In 2011, Miha and Franc derived new single-lead KL basis functions for the ST segments, and used them to compute normalized and non-normalized KL coefficients. The calculated KL coefficients are centered by subtracting their mean values. The single-lead KL coefficient time series are in the kl-single subdirectory.
- In 2015, Miha and Franc derived another new set of single-lead KL basis functions for the ST segments, and used them to compute normalized and non-normalized KL coefficients. This time, the KL coefficients are not mean-centered. The single-lead KL coefficient time series are in the kl-single-uncentralized subdirectory.
Derivation of the Legendre orthonormal-transform normalized and non-normalized coefficient time series, derivation of new single-lead KL basis functions for the ST segments, and derivation of normalized and non-normalized KL coefficient time series is described in reference 5 below.
The kl-single and kl-single-uncentralized projects use different techniques (time-domain and KL-based, respectively) to remove noisy heartbeats. The KL transform is therefore applied to two different covariance matrices, derived from two different sets of ST segments, which results in two slightly different sets of basis functions. More importantly, only the kl-single coefficients are centered by their mean values.
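The centering step that distinguishes the two sets of coefficients amounts to subtracting the mean of each coefficient time series. The following is a minimal sketch with a hypothetical function name and made-up values; it does not reproduce how the published series were computed.

```python
def center_by_mean(coefficients):
    """Subtract the mean of a coefficient time series from every value."""
    mean = sum(coefficients) / len(coefficients)
    return [c - mean for c in coefficients]


uncentered = [1.0, 2.0, 3.0]       # illustrative values only
print(center_by_mean(uncentered))  # [-1.0, 0.0, 1.0]
```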
For further information, please contact:
Franc Jager
Laboratory of Biomedical Computer Systems and Imaging
University of Ljubljana
Faculty of Computer and Information Science
1000 Ljubljana, Slovenia
National Research Council (CNR)
Institute of Clinical Physiology
Via Moruzzi 1
56124 Pisa, Italy
George B. Moody
Harvard-MIT Division of Health Sciences and Technology
Massachusetts Institute of Technology, Room E25-505A
77 Massachusetts Avenue
Cambridge, MA 02139 USA
1. Franc Jager, George B. Moody, Alessandro Taddei, Gorazd Antolic, Mitja Zabukovec, Maja Skrjanc, Michele Emdin, and Roger G. Mark. Development of a Long-Term Database for Assessing the Performance of Transient Ischemia Detectors. Computers in Cardiology 1996, pp. 481-484, IEEE Press. ISSN 0276-6547.
2. Franc Jager, George B. Moody, Alessandro Taddei, Gorazd Antolic, Ales Smrdel, Boris Glavic, Michele Emdin, Carlo Marchesi, and Roger G. Mark. Research Resources for Development and Evaluation of Transient Ischemia Detectors. Proc. Computer-Aided Data Analysis in Medicine (CADAM 98), Informatica Medica Slovenica, 5(1,2):45-54, 1998. ISSN 1318-2129.
3. Franc Jager, George B. Moody, Alessandro Taddei, Gorazd Antolic, Michele Emdin, Ales Smrdel, Boris Glavic, Carlo Marchesi, and Roger G. Mark. A Long-Term ST Database for Development and Evaluation of Ischemia Detectors. Computers in Cardiology 1998, pp. 301-304, IEEE Press. ISSN 0276-6547.
4. Franc Jager, Alessandro Taddei, Michele Emdin, Gorazd Antolic, Roman Dorn, George B. Moody, Boris Glavic, Ales Smrdel, M Varanini, Mitja Zabukovec, Simone Bordigiago, Carlo Marchesi, and Roger G. Mark. The Long-Term ST Database: A Research Resource for Algorithm Development and Physiologic Studies of Transient Myocardial Ischemia. Computers in Cardiology 2000, pp. 841-844.
5. Miha Amon. Robustno ocenjevanje oblik elektrokardiograma z uporabo ortogonalnih transformacij. [Robust estimation of morphologic features and shape representation of electrocardiograms using orthogonal transforms; in Slovene, includes English abstract.] MSc Thesis, 2011, Faculty of Computer and Information Science, University of Ljubljana, Slovenia.
6. Miha Amon and Franc Jager. Electrocardiogram ST-Segment Morphology Delineation Method Using Orthogonal Transformations. PLOS ONE, February 10, 2016. DOI: 10.1371/journal.pone.0148814.