Data Warehouse, Data Mart, and BI Tools | IT Passport Exam Prep
A structured overview of data warehouses (DWH), data marts, BI tools, ETL, and their differences from data lakes, tailored for the IT Passport exam.
The Big Picture of Data Utilization
Data accumulated daily in operational systems (OLTP) is extracted, transformed, and loaded through ETL processing, then consolidated into a data warehouse (DWH). Afterward, it is visualized and analyzed using BI tools to support business decision-making. This entire sequence forms the big picture of data utilization.
Data Warehouse (DWH)
A data warehouse (DWH) is a database specialized for analytical purposes, storing historical data from multiple operational systems over long periods. Its key feature is that data is essentially appended and never overwritten. Notable services include Snowflake, Amazon Redshift, and Google BigQuery.
Data Mart
A data mart is a subset of a DWH, shaped for a specific department or analytical goal. Examples include sales marts, marketing marts, and finance marts. Data marts are smaller than a DWH and offer the advantage of faster response times.
Data Lake
A data lake is a system that stores raw data as-is, regardless of whether it is structured or unstructured. Data is processed and analyzed later as needed. It is well-suited for big data and AI training data storage; details are covered in NoSQL and Big Data.
ETL
ETL stands for Extract, Transform, and Load—a processing pipeline used to migrate data from operational systems to a DWH. In recent years, the ELT approach (Load first, then Transform) has also become widespread.
BI Tools
BI (Business Intelligence) tools are software that analyze and visualize accumulated data to support business decision-making. Representative tools include Tableau, Microsoft Power BI, Looker, and Google Data Studio. They offer features such as dashboards, reports, pivot analysis, and drill-downs.
OLAP (Multidimensional Analysis)
OLAP (Online Analytical Processing) is a technology for performing multidimensional analysis on a DWH. It uses operations like slice, dice, drill-down, and roll-up to extract data from various perspectives.
Key Points for the IT Passport Exam
The exam frequently covers DWH characteristics (append-only, long-term retention, analytical use), the three steps of ETL, the purpose of BI tools, and the differences between data lakes and DWHs. Mastering these topics will help you secure points.
Typical Past Exam Question Patterns
- "Which process transfers data from operational systems to a DWH?" → ETL
- "Which tool supports data analysis and visualization?" → BI
Related Terms
- NoSQL and Big Data (NoSQL and Big Data)
- AI and Machine Learning (AI and Machine Learning Basics)
- DX (What is DX?)
Study Tips
It is important to memorize the big picture of data utilization as the flow: "Operations → ETL → DWH → BI." Clearly distinguish between a DWH (processed data) and a data lake (raw data). Also, understanding the ETL acronym (Extract, Transform, Load) in order will make exam questions easier to handle.
Summary
By connecting the characteristics of a DWH, the ETL process, and the role of BI tools, you can reliably score points on database questions in the Technology domain. For comprehensive practice in the Technology area, check out the Technology Summary, and for full-length exam simulations, use the Mock Exam.
関連記事
What Is 5G? Differences from 4G and Use Cases for the IT Passport Exam
Organizes the three main features of 5G (high speed, low latency, massive connectivity), differences from 4G, and applications in autonomous driving and remote medicine for the IT Passport exam.
How to Use ABC Analysis and Pareto Charts for the IT Passport Exam
A breakdown of ABC analysis (priority analysis) procedures, how to read Pareto charts, and the 80:20 rule as covered in the IT Passport exam.
AI and Machine Learning Basics | Key IT Passport Exam Terminology
Organizes AI-related terms tested on the IT Passport exam, including the relationship between AI, machine learning, and deep learning, differences between supervised/unsupervised/reinforcement learning, and generative AI and LLMs.