Curating Data Readings

Public Deck • Created by DataCurator

30 cards
Card 1
Term
Critical Data Studies
Definition
A field that examines the cultural, ethical, and political dimensions of data. It rejects the view of data as neutral and instead investigates how data is generated, curated, and exerts power within sociotechnical systems.
Card 2
Term
Data Assemblage
Definition
The complex, interconnected sociotechnical system that produces and gives meaning to data. It includes thought, knowledge, finance, politics, materiality, practices, organizations, and laws.
Card 3
Term
"Raw Data" Is an Oxymoron
Definition
The concept that data is never truly raw or untouched. It is always "cooked": shaped, framed, and interpreted by human choices, tools, and disciplinary norms before it can function as data.
Card 4
Term
Technological Determinism
Definition
The flawed belief that technology (like Big Data) autonomously drives social change. Critical Data Studies (CDS) argues instead that technology is a product of society that in turn shapes society.
Card 5
Term
Apophenia
Definition
The human tendency to perceive meaningful patterns or connections in random or meaningless data. It is a significant risk when analyzing large datasets (see, for example, the Spurious Correlations website).
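Why large datasets invite apophenia can be shown with a small simulation. The following is a minimal sketch, assuming independent Gaussian noise and using hypothetical names (`pearson`, `strongest_chance_correlation`) that do not come from the readings. It generates many unrelated random series and reports the strongest correlation found between any pair.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(n_series=200, length=10, seed=0):
    """Largest |r| among all pairs of independent random series.

    Every series is pure noise, so any strong correlation is spurious.
    """
    rng = random.Random(seed)
    series = [
        [rng.gauss(0, 1) for _ in range(length)]
        for _ in range(n_series)
    ]
    return max(
        abs(pearson(a, b))
        for a, b in itertools.combinations(series, 2)
    )


if __name__ == "__main__":
    # 200 series give 19,900 pairs to search through.
    print(f"strongest |r| found in noise: {strongest_chance_correlation():.2f}")
```

With enough variables to compare, some pairs will look strongly related by chance alone, which is why a pattern found by searching a large dataset needs independent support before it is treated as meaningful.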
Card 6
Term
Big Data
Definition
Datasets that are too large or complex to be processed by traditional software. They are characterized by volume, velocity, and variety, and are generated continuously.
Card 7
Term
Small Data
Definition
Data that is generated only infrequently (e.g., a national census every 10 years) using inflexible methods.
Card 8
Term
Data Holding
Definition
Informal, often personal storage of data (e.g., backups, personal files). Lacks metadata, standards, and long-term preservation planning.
Card 9
Term
Data Archive
Definition
A formal, curated, and documented collection of data intended for long-term preservation and reuse. Includes data, metadata, and context, and is managed by specialists.
Card 10
Term
Trusted Digital Repository (TDR)
Definition
A certified digital repository that ensures long-term access to data, complying with standards like the Open Archival Information System (OAIS) model.
Card 11
Term
Cyber-infrastructure
Definition
Large-scale, standardized, and interoperable data infrastructures that are cross-institutional (e.g., for genomics or climate data).
Card 12
Term
Data Friction
Definition
The costs, resistances, and challenges involved in the collection, integration, and sharing of data.
Card 13
Term
DIKW Hierarchy
Definition
A model representing a hierarchy of understanding: Data → Information → Knowledge → Wisdom. Each higher level adds context, meaning, and value.
Card 14
Term
Data (in DIKW)
Definition
Discrete, objective symbols or facts without context or meaning (e.g., the number 72).
Card 15
Term
Information
Definition
Data that has been processed and organized to be meaningful and relevant for a purpose (e.g., "The temperature is 72°F").
Card 16
Term
Knowledge
Definition
Information that has been understood and internalized through experience and context. It is actionable information that answers "how" questions.
Card 17
Term
Wisdom
Definition
The evaluated understanding of knowledge. It involves judgment, ethics, values, and understanding long-term consequences. It answers "why" questions and cannot be automated.
Card 18
Term
Efficiency
Definition
Doing things right; the use of resources relative to an objective. Can be automated.
Card 19
Term
Effectiveness
Definition
Doing the right things; efficiency multiplied by value. Requires human wisdom and judgment.
Card 20
Term
Curating (Data)
Definition
The practice of selecting, managing, and preserving data. It is not a neutral act but a practice of knowledge creation and power.
Card 21
Term
Posthuman Curating
Definition
A concept where curating is no longer performed solely by humans but is a distributed process involving nonhuman agents like algorithms, software, and platforms.
Card 22
Term
Content Curation
Definition
The mundane, massive-scale practice of selecting and managing online content (e.g., reblogging on Tumblr). Users perform curation while simultaneously becoming data for algorithms.
Card 23
Term
The Curatorial
Definition
A philosophy or distinct field of discourse and thought about the practice and theory of curating.
Card 24
Term
Metadata
Definition
Data about data. It provides crucial context for other data, such as how, when, and by whom it was created. It is not neutral and enacts particular worldviews.
Card 25
Term
Quantification
Definition
The process of turning qualities into quantities. It is not a neutral, descriptive act but a situated, creative, and agential practice deeply entangled with power and world-making.
Card 26
Term
Surveillance Capitalism
Definition
An economic system centered on the commodification of personal data for profit, in which that data is used to predict and modify behavior.
Card 27
Term
Synthetic Data
Definition
Artificially generated data that mimics the statistical properties of real-world data. Used to address privacy concerns or data scarcity but raises questions about fidelity.
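The idea of mimicking statistical properties can be made concrete with a deliberately simple generator. This is a minimal sketch, assuming a single numeric variable and a Gaussian model; the function name `synthesize` and the sample values are hypothetical and only illustrative. It fits the mean and standard deviation of a real sample and draws new values from that fitted distribution.

```python
import random
import statistics


def synthesize(real_values, n, seed=0):
    """Draw n synthetic values from a Gaussian fitted to real_values.

    Only the mean and standard deviation of the real data are preserved;
    no original record is copied into the output.
    """
    mu = statistics.mean(real_values)
    sigma = statistics.stdev(real_values)
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical "real" readings, used only for illustration.
    real_temperatures = [61, 64, 66, 68, 70, 72, 72, 75, 79, 83]
    synthetic = synthesize(real_temperatures, n=5000)
    print("real mean:     ", statistics.mean(real_temperatures))
    print("synthetic mean:", round(statistics.mean(synthetic), 2))
```

The synthetic sample reproduces the mean and spread of the original, but a model this simple discards skew, outliers, and relationships between variables. That gap between what is preserved and what is lost is the fidelity question.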
Card 28
Term
Data Justice
Definition
An ethical framework concerned with fairness in the way data is used, highlighting how data-driven systems can reinforce existing inequalities and power structures.
Card 29
Term
Metadata Justice
Definition
The ongoing struggle to resist harmful classifications and update biased metadata standards (e.g., changing "illegal aliens" to "undocumented immigrants" in library systems).
Card 30
Term
Digital Divide (in Big Data)
Definition
The inequality in access to large-scale data resources, creating a divide between "Big Data rich" institutions (e.g., corporations, well-funded universities) and "Big Data poor" ones.