4) Technical and professional questions (the real separator)
This is where Irish BI interviews get concrete. You’ll be asked to explain tradeoffs, not just recite features. Expect follow-ups like “why that model?” and “what breaks if we change this?”
Q: Walk me through how you would model a star schema for sales analytics. What are your facts and dimensions?
Why they ask it: They want to see if you can design for performance, clarity, and correct aggregation.
Answer framework: “Grain first”: define grain → facts → dimensions → keys → slowly changing dimensions.
Example answer: “I start by defining the grain—say, one row per order line per day. The fact table would hold measures like net amount, quantity, discount, and foreign keys to dimensions. Dimensions would include customer, product, date, sales rep, and channel, with customer/product potentially as SCD Type 2 if attributes change. I’d also plan for conformed dimensions if multiple fact tables exist, like returns or web events.”
Common mistake: Listing tables without stating the grain, which is where most BI models go wrong.
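The grain-first design above can be sketched as a simple schema map. This is an illustrative sketch, not production DDL; all table and column names (fact_sales, dim_customer, and so on) are hypothetical.

```python
# Grain: one row per order line per day (stated first, before any tables).
fact_sales_columns = [
    "date_key",      # FK -> dim_date
    "customer_key",  # FK -> dim_customer (SCD Type 2 surrogate key)
    "product_key",   # FK -> dim_product (SCD Type 2 surrogate key)
    "rep_key",       # FK -> dim_sales_rep
    "channel_key",   # FK -> dim_channel
    "quantity",      # additive measure
    "net_amount",    # additive measure
    "discount",      # additive measure
]

dimensions = {
    "dim_date": ["date_key", "date", "month", "quarter", "year"],
    "dim_customer": ["customer_key", "customer_id", "segment",
                     "valid_from", "valid_to", "is_current"],  # SCD2 tracking columns
    "dim_product": ["product_key", "product_id", "category",
                    "valid_from", "valid_to", "is_current"],   # SCD2 tracking columns
    "dim_sales_rep": ["rep_key", "rep_id", "team"],
    "dim_channel": ["channel_key", "channel_name"],
}

# Sanity check: every foreign key in the fact resolves to exactly one dimension.
fact_fks = [c for c in fact_sales_columns if c.endswith("_key")]
assert all(
    sum(fk in cols for cols in dimensions.values()) == 1 for fk in fact_fks
)
```

Stating the grain as the first line of the design is what keeps the measures additive and the foreign keys unambiguous.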
Q: In SQL, how do you prevent double counting when joining fact tables to dimensions or other facts?
Why they ask it: Double counting is the classic BI failure mode.
Answer framework: “Cardinality check” answer: validate join keys, aggregate before join, use bridge tables when needed.
Example answer: “First I confirm the relationship cardinality—one-to-many should be safe, many-to-many is a red flag. If I need to combine facts, I aggregate each fact to a shared grain before joining, or I use a bridge table for many-to-many relationships like customers-to-segments. I also run reconciliation queries—row counts and sum checks—before trusting the result.”
Common mistake: Relying on DISTINCT as a band-aid instead of fixing the grain and joins.
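The fan-out problem and the aggregate-before-join fix can both be demonstrated in a few lines of SQL. This is a toy sketch using SQLite; the fact tables and amounts are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE fact_orders  (order_id INT, product_id INT, amount REAL);
CREATE TABLE fact_returns (return_id INT, product_id INT, amount REAL);
INSERT INTO fact_orders  VALUES (1, 10, 100.0), (2, 10, 50.0), (3, 20, 80.0);
INSERT INTO fact_returns VALUES (1, 10, 30.0),  (2, 10, 20.0);
""")

# Naive many-to-many join: each order row for product 10 matches both
# return rows, so order amounts are double counted (and product 20,
# which has no returns, is dropped by the inner join).
naive = con.execute("""
    SELECT SUM(o.amount)
    FROM fact_orders o
    JOIN fact_returns r ON o.product_id = r.product_id
""").fetchone()[0]
print(naive)  # 300.0 -- the true order total is 230.0

# Fix: aggregate each fact to the shared grain (product) before joining.
safe = con.execute("""
    WITH o AS (SELECT product_id, SUM(amount) AS ordered
               FROM fact_orders GROUP BY product_id),
         r AS (SELECT product_id, SUM(amount) AS returned
               FROM fact_returns GROUP BY product_id)
    SELECT SUM(o.ordered), SUM(COALESCE(r.returned, 0))
    FROM o LEFT JOIN r ON o.product_id = r.product_id
""").fetchone()
print(safe)  # (230.0, 50.0)
```

The reconciliation queries mentioned in the answer are exactly this kind of check: compare the joined total against the single-table total before trusting the result.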
Q: Explain the difference between an ETL and ELT approach, and when you’d choose each.
Why they ask it: They’re testing modern data stack thinking and cost/performance awareness.
Answer framework: Compare–Context–Decision: define both, tie to platform constraints, then choose.
Example answer: “ETL transforms before loading, which can help when the target system is limited or when you need strict control before data lands. ELT loads raw data first and transforms in the warehouse, which is common with cloud warehouses because compute is scalable and lineage is clearer. I choose ELT when we want reproducibility and fast iteration, and ETL when we have sensitive data constraints or a legacy target that can’t handle heavy transforms.”
Common mistake: Treating ELT as automatically ‘better’ without considering governance and cost.
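The core tradeoff, where transformation happens and whether raw data survives, can be shown with a toy in-memory "warehouse". All names here are illustrative, not a real pipeline API.

```python
raw = [{"amount": "100"}, {"amount": "bad"}, {"amount": "50"}]

def clean(rows):
    # Business-rule transform: keep only rows whose amount parses as a number.
    return [{"amount": float(r["amount"])} for r in rows
            if r["amount"].replace(".", "", 1).isdigit()]

# ETL: transform before loading. The target never sees bad rows,
# but the raw records are gone if you need to reprocess later.
etl_warehouse = {"sales": clean(raw)}

# ELT: load raw first, then transform inside the warehouse.
# Raw data is preserved, so transforms are reproducible and auditable.
elt_warehouse = {"raw_sales": raw}
elt_warehouse["sales"] = clean(elt_warehouse["raw_sales"])

assert etl_warehouse["sales"] == elt_warehouse["sales"]
assert "raw_sales" not in etl_warehouse  # ETL kept no raw copy
```

The same governance question from the answer shows up here: in ELT, sensitive raw rows land in the warehouse before cleaning, which is exactly why ETL can still win under strict data constraints.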
Q: How do you design a semantic layer so a Business Intelligence Analyst can self-serve safely?
Why they ask it: They want scalable BI, not a ticket factory.
Answer framework: “Guardrails” answer: certified datasets, curated measures, role-based access, naming conventions.
Example answer: “I build a certified dataset with curated measures and consistent naming, and I hide raw columns that invite incorrect aggregation. I implement role-based access so analysts can explore within their permission scope without creating data leaks. Then I publish a KPI glossary and example reports so self-serve starts from a known-good base.”
Common mistake: Saying “self-serve means giving everyone raw tables.”
Q: For a Power BI Developer-style role: how do you decide between Import, DirectQuery, and a composite model?
Why they ask it: They’re testing performance, refresh strategy, and user experience.
Answer framework: Latency–Volume–Governance triad: decide based on freshness needs, data size, and model complexity.
Example answer: “If the business can tolerate scheduled refresh and the model fits in memory, Import gives the best performance and DAX flexibility. DirectQuery is for near-real-time needs or very large datasets, but I’m careful about query folding and source load. Composite models can balance both—Import for dimensions and frequently used aggregates, DirectQuery for a large transactional fact—while keeping a consistent semantic layer.”
Common mistake: Choosing DirectQuery just because the dataset is big, then delivering a slow, fragile report.
Q: For a Tableau Developer-style role: how do you choose between extracts and live connections, and how do you optimize performance?
Why they ask it: They want someone who can keep dashboards fast under real usage.
Answer framework: “Performance chain” answer: source → model → extract strategy → workbook design.
Example answer: “I use extracts when performance and stability matter, especially if the source is shared and I don’t want to hammer it with live queries. Live connections can work for governed, performant sources, but I still design with aggregated views and limit high-cardinality filters. On the workbook side, I reduce marks, avoid unnecessary quick filters, and use context filters strategically.”
Common mistake: Blaming Tableau for slowness when the real issue is the data model or workbook design.
Q: How do you test BI logic—SQL transformations, measures, and dashboards—before release?
Why they ask it: They’re checking whether you can prevent embarrassing KPI incidents.
Answer framework: “Three test types” answer: unit tests for transforms, reconciliation tests, and stakeholder UAT with defined acceptance criteria.
Example answer: “For transformations, I use automated tests for nulls, uniqueness, and referential integrity, plus business rule tests like ‘net revenue can’t be negative unless it’s a refund.’ I reconcile totals against a trusted source like finance reports for a sample period. Then I run UAT with a checklist: key KPIs, filters, row-level security behavior, and refresh success.”
Common mistake: Only eyeballing charts and calling it testing.
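The first two test layers can be sketched as plain assertions over toy data. The figures, column names, and the finance total are all invented for illustration; in practice these checks would live in a test framework rather than inline asserts.

```python
fact = [
    {"order_id": 1, "customer_key": 10, "net_revenue": 100.0, "is_refund": False},
    {"order_id": 2, "customer_key": 11, "net_revenue": -25.0, "is_refund": True},
]
dim_customer_keys = {10, 11}

# 1. Unit tests on the transform output.
assert all(r["order_id"] is not None for r in fact)               # no null keys
assert len({r["order_id"] for r in fact}) == len(fact)            # keys are unique
assert all(r["customer_key"] in dim_customer_keys for r in fact)  # referential integrity
assert all(r["net_revenue"] >= 0 or r["is_refund"] for r in fact) # business rule

# 2. Reconciliation against a trusted source for a sample period.
finance_total = 75.0  # hypothetical figure from the finance report
bi_total = sum(r["net_revenue"] for r in fact)
assert abs(bi_total - finance_total) < 0.01
```

The third layer, stakeholder UAT, stays a human checklist; the point of automating the first two is that the humans only ever review data that has already passed the mechanical checks.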
Q: How do you handle row-level security (RLS) and least-privilege access in BI tools?
Why they ask it: In Ireland, GDPR expectations are real; access mistakes are high-risk.
Answer framework: Principle–Implementation–Audit: state least privilege, explain implementation, explain how you validate.
Example answer: “I start with least privilege and define roles based on business needs—region, department, client. In the BI layer, I implement RLS using user-to-entity mapping tables and keep logic centralized so it’s consistent across reports. I validate with test accounts and I document who can see what, because security that isn’t auditable isn’t real security.”
Common mistake: Relying on ‘workspace permissions’ alone and forgetting data-level restrictions.
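The user-to-entity mapping approach from the answer can be sketched in a few lines. The users, regions, and amounts are hypothetical; in Power BI or Tableau the same mapping table would drive an RLS rule rather than a Python filter.

```python
# Centralized user-to-region mapping, kept in one place so every
# report applies the same rule.
user_region = {
    "aoife@example.ie": {"Leinster"},
    "sean@example.ie": {"Munster", "Connacht"},
}
fact_sales = [
    {"region": "Leinster", "amount": 100},
    {"region": "Munster",  "amount": 60},
    {"region": "Connacht", "amount": 40},
]

def rows_for(user):
    # Least privilege: a user with no mapping sees nothing, not everything.
    allowed = user_region.get(user, set())
    return [r for r in fact_sales if r["region"] in allowed]

# Validate with test accounts, as the answer recommends.
assert sum(r["amount"] for r in rows_for("sean@example.ie")) == 100
assert rows_for("unknown@example.ie") == []
```

Defaulting unmapped users to an empty set rather than to full access is the design choice that makes the "least privilege" claim real.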
Q: What GDPR considerations matter most for BI reporting in Ireland?
Why they ask it: They want to see if you understand privacy-by-design, not just dashboards.
Answer framework: “Data minimization” answer: purpose limitation, minimization, retention, and access controls.
Example answer: “For BI, I focus on purpose limitation—only collecting and reporting what’s needed for the business question—and data minimization, like avoiding unnecessary personal identifiers in datasets. I also care about retention and deletion workflows, especially if we’re building historical models. Finally, I ensure access controls and auditability, because a dashboard can become a data leak if it’s shared too widely.”
Common mistake: Treating GDPR as a legal team problem instead of a design constraint in your models.
Q: What would you do if the nightly refresh fails on the morning of an exec meeting?
Why they ask it: They’re testing incident response and stakeholder management under pressure.
Answer framework: Triage–Communicate–Recover–Prevent.
Example answer: “First I’d confirm scope: which datasets failed, what changed, and whether the last successful refresh is usable. I’d immediately message the exec sponsor with an ETA and a fallback—like using yesterday’s numbers clearly labeled—so no one is surprised in the meeting. Then I’d work the failure: check gateway/credentials, source availability, and recent schema changes, and rerun with a targeted refresh if possible. Afterward I’d document the root cause and add monitoring or schema drift checks to prevent repeats.”
Common mistake: Going silent while you troubleshoot, letting stakeholders discover the failure themselves.