4) Technical and professional questions (what separates prepared from average)
This is where US interviews get real. A BI Developer can’t hide behind “I’m more of a visual person.” Expect SQL depth, modeling clarity, and tool-specific decisions—especially if the role is close to a Power BI Developer or Tableau Developer track.
Q: How do you design a star schema for a sales analytics use case? What’s the grain?
Why they ask it: They’re testing whether you understand dimensional modeling and can prevent metric drift.
Answer framework: Grain-first modeling. State grain, facts, dimensions, keys, and slowly changing dimensions.
Example answer: “I start by defining grain—say, one row per order line per day. Then I build a fact table with additive measures like revenue, quantity, discount, and foreign keys to dimensions like customer, product, date, and sales rep. I keep dimensions conformed so ‘customer’ means the same across subject areas, and I plan for SCD Type 2 where attributes like customer segment change over time. That structure makes measures consistent and keeps BI tools fast.”
Common mistake: Describing tables without stating grain, which is how you end up with double-counting.
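The grain-first idea above can be sketched in runnable form. This is a minimal illustration using SQLite (table names, keys, and figures are invented): the fact table's primary key enforces the stated grain of one row per order line, and additive measures roll up safely through a conformed dimension.

```python
import sqlite3

# Minimal star-schema sketch: the fact table is one row per order line,
# and that grain is enforced by the primary key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_order_line (          -- grain: one row per order line
    order_id    INTEGER,
    line_number INTEGER,
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity    INTEGER,                -- additive measure
    revenue     REAL,                   -- additive measure
    PRIMARY KEY (order_id, line_number) -- enforces the stated grain
);
INSERT INTO dim_product VALUES (1, 'Hardware'), (2, 'Software');
INSERT INTO fact_order_line VALUES
    (100, 1, 1, 2, 40.0),
    (100, 2, 2, 1, 99.0),
    (101, 1, 1, 1, 20.0);
""")

# Because measures are additive at the declared grain, any rollup is
# just SUM over fact rows joined to a conformed dimension.
revenue_by_category = dict(conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_order_line f
    JOIN dim_product p USING (product_key)
    GROUP BY p.category
"""))
print(revenue_by_category)
```

If the grain were left unstated and an order-level table got joined in, the same SUM would silently double-count; stating the grain up front is what makes the aggregation trustworthy.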
Q: Write a SQL approach to find duplicate records and explain how you’d fix them upstream.
Why they ask it: They want practical SQL plus data engineering instincts.
Answer framework: Detect–Diagnose–Prevent. Show window functions, root cause, and a prevention mechanism.
Example answer: “I’d detect duplicates using row_number() over (partition by business_key order by updated_at desc) in a subquery, then filter in the outer query where the row number is greater than 1, since window functions can’t be referenced directly in WHERE. Then I’d diagnose whether it’s a join explosion, late-arriving events, or a missing unique constraint. Fix-wise, I prefer upstream prevention: enforce uniqueness in the transformation layer, add a dbt uniqueness test, and if needed implement idempotent loads so reruns don’t duplicate data.”
Common mistake: Only talking about deleting duplicates manually, which doesn’t scale.
Q: In Power BI, how do you choose between a calculated column, a measure, and Power Query transformations?
Why they ask it: They’re testing whether you understand performance, refresh cost, and model design as a Power BI Developer.
Answer framework: Compute location decision. Explain when to compute at refresh vs query time and why.
Example answer: “If it’s a row-level attribute that should be stored and reused—like a normalized category—I’ll do it in Power Query so it’s computed at refresh and keeps the model clean. If it’s an aggregation that must respond to filters—like revenue YTD—I’ll use a DAX measure. Calculated columns are my last resort when I need a column for slicing but can’t do it upstream; they can bloat the model and hurt performance.”
Common mistake: Using calculated columns for everything because it feels easier than DAX.
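The compute-location tradeoff can be shown outside Power BI with a toy Python analogy (the data is made up): a row-level attribute is materialized once at “refresh”, while a filter-responsive aggregate is computed at “query” time, mirroring Power Query versus a DAX measure.

```python
# Rows as they arrive from the source (hypothetical data).
rows = [
    {"product": "pro plan", "revenue": 99.0},
    {"product": "PRO PLAN", "revenue": 99.0},
    {"product": "basic",    "revenue": 19.0},
]

# "Refresh time" (Power Query analogue): store a normalized category
# on each row once, so the model stays clean and reusable.
for r in rows:
    r["category"] = r["product"].strip().lower()

# "Query time" (DAX measure analogue): aggregate on demand, so the
# result responds to whatever filter the user applies.
def total_revenue(rows, category=None):
    return sum(r["revenue"] for r in rows
               if category is None or r["category"] == category)

print(total_revenue(rows, "pro plan"))  # 198.0
```

The point of the analogy: the normalization is paid once per refresh, while the aggregation is paid per query, which is exactly the cost split the interviewer is probing for.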
Q: How do you optimize a slow Power BI report?
Why they ask it: They want a real troubleshooting playbook, not “I’d add an index.”
Answer framework: Measure–Isolate–Fix–Verify. Mention Performance Analyzer, DAX tuning, and model changes.
Example answer: “First I use Performance Analyzer to see whether the bottleneck is visuals, DAX, or the data source. Then I isolate expensive measures—often iterators like SUMX over large tables—and rewrite them using more efficient patterns. I also check model design: reduce cardinality, hide unused columns, and consider aggregations or incremental refresh. Finally, I verify improvements with before/after timings and confirm results didn’t change.”
Common mistake: Tweaking visuals randomly without measuring where time is actually spent.
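Measure–Isolate–Fix–Verify works the same way outside Power BI. This sketch (synthetic data, stand-in functions) times an iterator-style aggregation against a single-pass rewrite and, critically, verifies the numbers didn’t change:

```python
import time

def slow_totals(sales, customers):
    # One full scan per customer: analogous to an expensive iterator
    # measure like SUMX over a large table.
    return {c: sum(amt for cust, amt in sales if cust == c)
            for c in customers}

def fast_totals(sales):
    # Single pass over the data: the more efficient rewrite.
    totals = {}
    for cust, amt in sales:
        totals[cust] = totals.get(cust, 0.0) + amt
    return totals

sales = [(i % 50, float(i % 7)) for i in range(100_000)]
customers = range(50)

t0 = time.perf_counter()
slow = slow_totals(sales, customers)
t_slow = time.perf_counter() - t0

t0 = time.perf_counter()
fast = fast_totals(sales)
t_fast = time.perf_counter() - t0

assert slow == fast  # verify: the fix changed timing, not results
print(f"slow={t_slow:.3f}s fast={t_fast:.3f}s")
```

The final assert is the part candidates skip in interviews: an optimization that changes the numbers is a regression, not a fix.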
Q: In Tableau, how do you handle row-level security and performance for large datasets?
Why they ask it: They’re checking Tableau Developer maturity: security patterns plus extract/live tradeoffs.
Answer framework: Security + performance pairing. Explain RLS approach and how you keep dashboards responsive.
Example answer: “For row-level security, I typically use user filters tied to a security table mapping users to allowed entities, or I enforce it at the database layer with views when possible. For performance, I decide between extracts and live connections based on freshness needs and query load. I also optimize by limiting high-cardinality quick filters, using context filters carefully, and designing extracts with only needed fields.”
Common mistake: Relying on workbook-level hacks for security instead of a maintainable security model.
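The database-layer version of that security table can be sketched with SQLite (usernames, regions, and figures are invented; in Tableau this mapping would back a user filter or a database view):

```python
import sqlite3

# Row-level security via a mapping table: each user only sees rows
# whose region appears in their security entries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, revenue REAL);
CREATE TABLE user_region (username TEXT, region TEXT);
INSERT INTO sales VALUES ('east', 100.0), ('west', 250.0);
INSERT INTO user_region VALUES
    ('alice', 'east'), ('bob', 'west'), ('bob', 'east');
""")

def visible_sales(username):
    # The join to the security table does the filtering; no per-user
    # SQL, no workbook-level hacks.
    return conn.execute("""
        SELECT s.region, s.revenue
        FROM sales s
        JOIN user_region u ON u.region = s.region
        WHERE u.username = ?
        ORDER BY s.region
    """, (username,)).fetchall()

print(visible_sales("alice"))  # [('east', 100.0)]
```

Granting or revoking access is now a row change in `user_region` rather than an edit to every workbook, which is what “maintainable security model” means in practice.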
Q: Explain how you’d build a semantic layer so Finance and Sales stop getting different answers.
Why they ask it: They want to see if you can create governed metrics, not just reports.
Answer framework: Single definition, multiple surfaces. Define where logic lives, how it’s versioned, and how it’s documented.
Example answer: “I’d centralize metric logic in a semantic layer—either in the BI model or in a transformation layer—so ‘gross margin’ isn’t reimplemented in five dashboards. I’d version the definitions, add documentation and examples, and set up a certification process for ‘official’ datasets. Then I’d migrate key dashboards to the certified model and deprecate the old ones with a clear timeline.”
Common mistake: Trying to solve metric inconsistency by sending a Slack message with definitions.
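“Single definition, multiple surfaces” can be illustrated with a tiny metric registry (the metric name, version, and sample rows are hypothetical; in production this role is played by the BI model or a transformation layer):

```python
def gross_margin(rows):
    """(revenue - cogs) / revenue; defined once, used everywhere."""
    revenue = sum(r["revenue"] for r in rows)
    cogs = sum(r["cogs"] for r in rows)
    return (revenue - cogs) / revenue

# Versioned, documented, certified definitions live in one place.
METRICS = {
    "gross_margin": {"version": 2, "fn": gross_margin},
}

def metric(name, rows):
    return METRICS[name]["fn"](rows)

rows = [{"revenue": 100.0, "cogs": 60.0},
        {"revenue": 50.0,  "cogs": 30.0}]

# Finance and Sales surfaces both call the same certified definition,
# so they cannot drift apart.
print(metric("gross_margin", rows))  # 0.4
```

When the definition changes, the version bumps in one place and every consumer picks it up, which is the opposite of five dashboards each reimplementing the formula.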
Q: What’s your approach to incremental loads and late-arriving data for BI reporting?
Why they ask it: They’re testing whether you can keep data fresh without breaking historical accuracy.
Answer framework: Freshness–Correctness–Cost triangle. Explain how you balance them.
Example answer: “I define freshness requirements per dataset—some need hourly, others daily. For incremental loads, I use watermarking on event time or updated time, but I also plan for late-arriving data with a lookback window, like reprocessing the last 7 days. I validate with reconciliation checks and monitor drift. That keeps costs down while maintaining correctness for reporting.”
Common mistake: Assuming event time is always reliable and never planning for late updates.
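The watermark-plus-lookback pattern from the answer can be sketched directly (the row lists stand in for real source tables, and the 7-day window is the example from the answer, not a universal default):

```python
from datetime import date, timedelta

LOOKBACK_DAYS = 7  # reprocessing window for late-arriving events

def rows_to_reprocess(source_rows, watermark):
    # Reload everything at or after (watermark - lookback), so events
    # that arrived late inside the window get picked up on this run.
    cutoff = watermark - timedelta(days=LOOKBACK_DAYS)
    return [r for r in source_rows if r["event_date"] >= cutoff]

source_rows = [
    {"id": 1, "event_date": date(2024, 3, 1)},   # old, settled
    {"id": 2, "event_date": date(2024, 3, 9)},   # inside the lookback
    {"id": 3, "event_date": date(2024, 3, 12)},  # new since last run
]
watermark = date(2024, 3, 12)  # max event_date seen by the last load

batch = rows_to_reprocess(source_rows, watermark)
print([r["id"] for r in batch])  # [2, 3]
```

Because the window is reprocessed every run, the load step must be idempotent (merge on a business key rather than blind append), otherwise the lookback itself creates duplicates.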
Q: How do you test BI logic so you don’t ship broken metrics?
Why they ask it: They want engineering discipline applied to analytics.
Answer framework: Layered testing. Unit tests for transformations, reconciliation for aggregates, and UAT with stakeholders.
Example answer: “I test at multiple layers: schema and uniqueness tests on transformed tables, business rule tests like ‘refunds can’t exceed revenue,’ and reconciliation queries against known totals. For dashboards, I do a numbers walkthrough with a stakeholder using a fixed filter set so we can confirm expected outputs. I also document assumptions so future changes don’t silently break logic.”
Common mistake: Treating dashboard QA as ‘looks good to me’ instead of validating numbers.
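The layered tests in the answer translate to very small checks. A sketch in plain Python (invented rows; in practice these would be dbt tests or CI assertions):

```python
# Layer 1: schema/uniqueness; Layer 2: business rules;
# Layer 3: reconciliation against a known total.
orders = [
    {"order_id": 1, "revenue": 100.0, "refund": 10.0},
    {"order_id": 2, "revenue": 50.0,  "refund": 0.0},
]

def check_unique(rows, key):
    keys = [r[key] for r in rows]
    return len(keys) == len(set(keys))

def check_refunds_bounded(rows):
    # Business rule from the answer: refunds can't exceed revenue.
    return all(r["refund"] <= r["revenue"] for r in rows)

def reconcile_total(rows, expected, tol=0.01):
    # Compare against an independently known total (e.g. the ledger).
    return abs(sum(r["revenue"] for r in rows) - expected) <= tol

results = {
    "unique_order_id": check_unique(orders, "order_id"),
    "refunds_bounded": check_refunds_bounded(orders),
    "revenue_matches_ledger": reconcile_total(orders, 150.0),
}
print(results)
```

Each check is cheap to run on every refresh, which is what turns dashboard QA from “looks good to me” into something a pipeline can enforce.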
Q: If you’re working with healthcare or finance data, what US regulations or standards affect your BI work?
Why they ask it: They’re checking whether you understand compliance impacts on access, logging, and data handling in the US.
Answer framework: Regulation → control → BI implication. Name the rule, the control, and what you do differently.
Example answer: “In healthcare, HIPAA affects how PHI is accessed, logged, and shared, so I design datasets with minimum necessary fields and enforce role-based access. In finance or public companies, SOX influences controls around reporting changes, so I’m careful about change management, approvals, and audit trails for metric definitions. Practically, that means certified datasets, documented transformations, and clear access reviews.”
Common mistake: Saying “I’m not responsible for compliance” and ignoring how BI can leak sensitive data.
Q: A dashboard shows different totals than the CFO’s spreadsheet. How do you debug it?
Why they ask it: This is the real job: reconciling truth under pressure.
Answer framework: Reconcile by slicing. Align definitions, time windows, filters, and grain step-by-step.
Example answer: “First I align definitions: what exactly is included in the CFO’s total—recognized vs booked, net vs gross, currency, and time zone. Then I compare at the lowest common grain, like transactions by day, and identify where the divergence starts. I check filters, joins, and deduplication logic, and I confirm whether the spreadsheet includes manual adjustments. Once we find the cause, I document it and update either the BI logic or the stakeholder guidance so it doesn’t repeat.”
Common mistake: Arguing that the dashboard is right before you’ve aligned definitions and grain.
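“Compare at the lowest common grain and find where the divergence starts” is mechanical once framed as code. A sketch with invented daily totals:

```python
# Dashboard vs spreadsheet totals at the lowest common grain (day).
dashboard = {"2024-03-01": 100.0, "2024-03-02": 200.0, "2024-03-03": 150.0}
spreadsheet = {"2024-03-01": 100.0, "2024-03-02": 180.0, "2024-03-03": 150.0}

def first_divergence(a, b, tol=0.01):
    # Walk the slices in order and report the first day that disagrees
    # beyond tolerance, with both values for investigation.
    for day in sorted(set(a) | set(b)):
        if abs(a.get(day, 0.0) - b.get(day, 0.0)) > tol:
            return day, a.get(day, 0.0), b.get(day, 0.0)
    return None

print(first_divergence(dashboard, spreadsheet))
# ('2024-03-02', 200.0, 180.0)
```

Pinpointing the first diverging slice narrows the investigation from “the totals disagree” to one day’s filters, joins, and possible manual adjustments.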
Q: What would you do if the BI tool or refresh pipeline fails right before an executive meeting?
Why they ask it: They’re testing incident response, stakeholder communication, and backup planning.
Answer framework: Triage–Communicate–Fallback–Fix–Prevent.
Example answer: “I’d triage quickly: is it a gateway issue, credential expiry, source outage, or capacity problem? In parallel, I’d communicate a clear status and ETA to the meeting owner—no vague ‘looking into it.’ If needed, I’d use a fallback like a cached export from the last successful refresh with a timestamp and a caveat. After the meeting, I’d fix root cause and add monitoring/alerts so the failure is caught earlier next time.”
Common mistake: Going silent while you troubleshoot, leaving stakeholders to discover the failure live.
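The fallback step can be sketched as code (the failing refresh, cache contents, and figures are all simulated stand-ins for a real pipeline): serve the last successful snapshot with an explicit timestamp and caveat instead of going dark.

```python
from datetime import datetime

# Last successful refresh, kept around exactly for this scenario.
cache = {"as_of": datetime(2024, 3, 12, 6, 0),
         "data": {"revenue": 1_200_000}}

def refresh():
    raise RuntimeError("gateway timeout")  # simulated source outage

def get_report():
    try:
        return {"data": refresh(), "stale": False, "caveat": None}
    except Exception as exc:
        # Fallback: cached export with an explicit freshness caveat,
        # so stakeholders see dated-but-labeled numbers, not nothing.
        return {
            "data": cache["data"],
            "stale": True,
            "caveat": f"Figures as of {cache['as_of']:%Y-%m-%d %H:%M} "
                      f"(refresh failed: {exc})",
        }

report = get_report()
print(report["caveat"])
```

The caveat string is the communication step in miniature: a timestamp and a reason, rather than silence or unlabeled stale numbers.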