University of Pretoria

File(s) under embargo

Reason: Publication pending

1 year(s), 10 month(s), 18 day(s) until file(s) become available

Shrinkage estimator evaluation of multicollinear big data

dataset
posted on 2024-06-12, 14:21 authored by Salomi du Plessis, Mohammad Arashi, Gaonyalelwe Maribe

Two datasets were used to apply the techniques proposed for addressing multicollinearity in the high-dimensional and big data domains:

  1. Alon.csv: a cancer classification dataset containing 2000 gene expression levels for 62 colon tissue samples, of which 40 are cancerous.
  2. Airport.csv: data obtained from the U.S. Bureau of Transportation Statistics, which provides information on U.S. transportation systems. For major air carriers, the dataset records the origin and destination airports of flights, the scheduled departure and arrival times, the departure and arrival delays, and the flight time and distance to the destination airport.


Department/Unit

Statistics

Sustainable Development Goals

  • 3 Good Health and Well-Being
  • 10 Reduced Inequalities

Natural and Agricultural Sciences
