The world of data (and much of its terminology) can be a confusing, scary place. Thus, we are over-excited to present the official Standard Co Data 101 Glossary, yet another tool to help create a better Data Experience for all. Think we missed a term? Drop us a line here!
Analytics - The uncovering of patterns, trends, consistencies, and outliers in a given set of recorded information by way of mathematics, statistics, and/or computer programming; a process in which Standard Data makes one’s life so much easier.
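Want to see an outlier get caught in the act? Here is a minimal sketch, assuming a simple "more than two standard deviations from the mean" rule. The `find_outliers` helper, the threshold, and the `daily_signups` numbers are all made up for illustration; real analytics work usually calls for sturdier methods.

```python
from statistics import mean, stdev

def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # Every value is identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical data: one day looks very different from the rest.
daily_signups = [12, 15, 14, 13, 16, 15, 14, 95]
print(find_outliers(daily_signups))  # → [95]
```

One caveat: a big outlier inflates the standard deviation it is measured against, so this rule can miss things in small samples.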
Artificial Intelligence (also AI) - The creation and subsequent usage of computing machines and processes that can simulate aspects of human intelligence, often used to “learn” patterns and detect outliers in a given set of recorded information; contrary to popular belief, AI isn’t a precursor to the downfall of civilization, but a tool that helps data-lovers do more with data (for now).
Big Data - This refers to the vast amounts of structured and unstructured data that can come from a plethora of sources incorporating the “three Vs of data”: volume, variety and velocity.
Bigtable (also BigTable) - Bigtable is a fully managed wide-column and key-value NoSQL database service for large analytical and operational workloads as part of the Google Cloud portfolio; in plain speak, it is a distributed, column-oriented data store created by Google to handle enormous amounts of structured data for web services like Search and Google Maps.
Biometrics - The statistical analysis of human characteristics, both physiological and behavioral; think: fingerprints, iris patterns, facial features, sleep patterns, body fat, and/or blood pressure; a measurement of all the things that can be used to identify other terrestrial beings.
Choropleth - AKA, not a heat map; a powerful type of data visualization that quickly communicates how “intense” a given region is, using intensity of color to represent an aggregate summary of a geographic characteristic within spatial units, such as population density or per-capita income. Sound confusing? Read more about choropleths vs. heat maps here.
Cold Storage (also Cold Data) - Basically, this term refers to data that has been collected in the past, but is rarely used; think: archived data. To optimize storage costs, cold data can be securely stored on lower-performing and less expensive storage media. BTW: cold storage is an available feature of Standard Data!
Dashboard - An information management tool (a la Metabase, Tableau, etc.) used to track, analyze, and display key performance indicators, metrics, and data points; a dynamic, real-time data visualization tool that is fully customizable to show top-line success metrics and/or trends. Standard Data has an auto-dashboard feature (eloquently titled Standard Dashboard), but if organizational needs are more complex, Team Standard Co builds professional-grade dashboards in no time.
Data - AKA, the thing we love more than all other things. Data, in essence, is information ready and willing to be processed and analyzed; facts and statistics collected together for reference.
Data Center - This is a facility containing a large number of networked computers used for storing, processing, and distributing large amounts of data. It houses IT equipment such as servers, routers and firewalls, as well as necessary infrastructure for the building such as power supplies, backup generators and ventilation systems. Think: Fort Knox, but with supercomputers instead of gold bullion.
Data Experience - AKA, the Standard Co difference. Data Experience, a la User Experience or Customer Experience, describes a company or organization’s relationship with, and implementation of, data. Data helps organizations of all sizes to make better decisions, faster. Standard Co helps organizations understand, and implement, a better Data Experience.
Data Lake - A vast pool (hence, lake) of raw data without a pre-defined purpose, typically collected at large for later use once a strategy and/or specific need for said data has been identified. Composed of both structured and unstructured data, a typical data lake holds raw information that is just waiting for its time to shine as a more functional component of the Data Experience.
Data Mart - A subset of the data warehouse that stores data used by a particular group within a company, such as the sales team. While the warehouse serves the organization at large, each data mart targets a specific need or purpose.
Data Mining - The process of deep diving into raw data for the purpose of unearthing insights and analyzing results. Better data mining means better decision-making, period. It often relies on database software (a la Microsoft SQL Server) and feeds into predictive analytics.
Data Scientist - A professional in an interdisciplinary field who deploys scientific methods, processes, algorithms, and data systems to extract knowledge and insights from structured and unstructured data. Data Scientists apply this knowledge to help leaders make better decisions, faster. A Data Scientist differs from a Data Analyst in that the Data Scientist works on new systems for capturing and analyzing data, whereas the Data Analyst synthesizes existing data to help make sense of it.
Data Storage Devices - Typically, these are electromagnetic archives where information is electronically stored for short- or long-term use. Many are removable and connect to the computer via an input/output port; think: a USB thumb drive, an external hard drive, or even antiquated physical media, like our old CD-R mixtape collections.
Data Warehouse - Not to be confused with a Data Lake, a Data Warehouse is a digital repository for structured, filtered data that has already been processed for a specific purpose. A hashing system may be used to make data easily searchable.
Encryption - The conversion of data into secret code that hides the information's true meaning, primarily for security purposes. Unencrypted data is commonly known as plaintext while encrypted data is called ciphertext.
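To make the plaintext-to-ciphertext round trip concrete, here is a toy sketch using a repeating-key XOR. Fair warning: this is for illustration only and is NOT secure; the `xor_bytes` helper and the key are our own inventions, and real systems should use a vetted cryptography library.

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of `data` with a repeating `key` (toy cipher, not secure)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

plaintext = b"meet me at the data lake"
key = b"not-a-real-key"  # hypothetical key, for demonstration only

ciphertext = xor_bytes(plaintext, key)  # unreadable without the key
recovered = xor_bytes(ciphertext, key)  # applying the same key reverses it

print(ciphertext.hex())
print(recovered.decode())
```

Because XOR undoes itself, the same function both encrypts and decrypts here. Production-grade ciphers are far more involved.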
Exabyte - The equivalent of 1,000 petabytes (or 1,024 if you count in binary multiples); AKA an incomprehensible amount of recorded information. Big Data is often measured in exabytes.
Geospatial Data - A given set of information that describes objects, events or other features that specifically reference a location on or near the surface of Mother Earth, typically combining location information (usually global coordinates) and attribute information (the characteristics of the phenomena in question) with temporal information (the time or lifespan at which the location and attributes exist). Need an example? Check out our COVID Mapping Project.
Heat Map (also Heatmap, both are correct) - AKA not a choropleth; a data visualization technique that shows the magnitude of a given set of recorded information as color in two dimensions, be it via hue or intensity, to offer non-data-lovers visual cues as to how the data varies over a given area. Sound confusing? Read more about choropleths vs. heat maps here.
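One way to picture the difference: a choropleth summarizes values inside predefined regions, while a heat map looks at how the data varies across the area itself. The sketch below assumes a tiny, made-up set of observations and two hypothetical helpers (`choropleth_summary` and `heat_map_grid`); it only computes the numbers a charting tool would then turn into color.

```python
from collections import Counter, defaultdict

# Hypothetical observations: (region name, x coordinate, y coordinate, value)
observations = [
    ("North", 0.5, 3.5, 10),
    ("North", 1.5, 3.2, 20),
    ("South", 0.7, 0.4, 5),
    ("South", 3.1, 0.9, 15),
    ("South", 3.4, 0.2, 40),
]

def choropleth_summary(rows):
    """Average value per named region: one number (one color) per spatial unit."""
    by_region = defaultdict(list)
    for region, _x, _y, value in rows:
        by_region[region].append(value)
    return {region: sum(vals) / len(vals) for region, vals in by_region.items()}

def heat_map_grid(rows, cell_size=1.0):
    """Count of points per grid cell, ignoring region boundaries entirely."""
    counts = Counter()
    for _region, x, y, _value in rows:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return dict(counts)

print(choropleth_summary(observations))  # → {'North': 15.0, 'South': 20.0}
print(heat_map_grid(observations))
```

Notice that the grid version never looks at the region names at all, which is the heart of the distinction.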
Metadata - Simply put, this is data that describes other data. This information is used by search engines to filter through documents and generate appropriate matches. Think: YouTube tagging, SEO tags, product description copy, headlines organization, etc.
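As a rough illustration of metadata doing the filtering, here is a small sketch. The `videos` catalog, its tags, and the `search_by_tag` helper are hypothetical; the point is simply that the search reads the descriptive tags and never opens the content itself.

```python
# Hypothetical mini-catalog: each record pairs content with metadata describing it.
videos = [
    {"content": "intro_to_dashboards.mp4",
     "metadata": {"title": "Intro to Dashboards", "tags": ["dashboard", "dataviz"]}},
    {"content": "cold_storage_explained.mp4",
     "metadata": {"title": "Cold Storage Explained", "tags": ["storage", "archive"]}},
    {"content": "heat_maps_vs_choropleths.mp4",
     "metadata": {"title": "Heat Maps vs. Choropleths", "tags": ["dataviz", "maps"]}},
]

def search_by_tag(records, tag):
    """Return titles of records whose metadata tags include `tag`."""
    return [r["metadata"]["title"] for r in records if tag in r["metadata"]["tags"]]

print(search_by_tag(videos, "dataviz"))
# → ['Intro to Dashboards', 'Heat Maps vs. Choropleths']
```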
Open Data - Data and information that is freely accessible for use, editing, and distribution by anyone, anywhere, at any time. Team Standard Co loves a good side-hustle, so be sure to check out some of our Open Data Projects to give Standard Data a test drive.
Petabyte - The equivalent of 1 million gigabytes; that’s a whole lot of data, people.
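If the unit math makes your head spin, a quick sketch can help. It assumes the decimal convention (each step up is 1,000 times bigger) by default, with the binary convention (1,024 per step) as an option; strictly speaking, the binary multiples have their own names (pebibyte, exbibyte), but many people use the everyday names for both. The `convert` helper and the `UNITS` list are our own illustration.

```python
DECIMAL = 1000  # each step up is 1,000 times bigger
BINARY = 1024   # each step up is 1,024 times bigger

UNITS = ["byte", "kilobyte", "megabyte", "gigabyte",
         "terabyte", "petabyte", "exabyte"]

def convert(amount, from_unit, to_unit, base=DECIMAL):
    """Convert `amount` between storage units using the chosen base."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return amount * base ** steps

print(convert(1, "petabyte", "gigabyte"))              # → 1000000
print(convert(1, "exabyte", "petabyte"))               # → 1000
print(convert(1, "exabyte", "petabyte", base=BINARY))  # → 1024
```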
Visualization / DataViz - AKA, the thing that gets us out of bed in the morning; broadly, this is the use of a chart, diagram, picture, heat map, graph, trend line, choropleth, etc. to visually represent information regardless of complexity. The best Data Visualizations are clean, streamlined, and purpose-driven, as defined by the Standard Co Data Experience principles.
Still lost? Let’s talk. Our team of real humans is standing by to help you do more with data. Our mission is to build a better Data Experience for all, and knowing the ins and outs of data terminology is just one step in helping decision-makers and non-data-lovers make better decisions, faster.