Quick Answer
SQL is the #1 skill demanded in data analyst job listings in India in 2026 — appearing in over 85% of postings. A focused learner can master the essentials in 4–8 weeks. Priority concepts: JOINs, GROUP BY, window functions, CTEs, and subqueries.
5 Things to Know About SQL for Data Analysts in 2026
- SQL appears in 85%+ of data analyst job descriptions in India — it is the single most demanded skill.
- Window functions (ROW_NUMBER, RANK, LAG/LEAD) are the most tested advanced SQL topic in interviews.
- SQL is now used with AI tools — ChatGPT and Copilot can generate SQL, but you still need to understand it to validate the output.
- SQL integrates directly with Power BI, Python (Pandas), Excel Power Query, and cloud platforms.
- Learning SQL does NOT require a Computer Science degree — anyone can learn it in 4–8 weeks with focused practice.
Why SQL Is Still King in 2026
Every few years someone declares that SQL is dying. In 2026, it is more alive than ever.
Here is the reality: most business data lives in tables. Whether it is a MySQL database, a PostgreSQL server, a Snowflake data warehouse, a Databricks lakehouse, or a Google BigQuery project, the way you query structured data is SQL.
AI tools like ChatGPT, Copilot, and Gemini can now generate SQL queries from plain English. But you still need to:
- Understand if the generated SQL is correct
- Debug it when it returns wrong results
- Optimize it when it runs slowly
- Understand the business logic embedded in the query
SQL literacy is more important than ever — not despite AI, but because of it.
"SQL skills are growing in demand, not shrinking. Every new data platform — Snowflake, Databricks, BigQuery — is built around SQL as the primary query language." — 2026 Data Analytics Job Market Analysis
The 10 SQL Concepts Every Data Analyst Must Know
| # | Concept | Why It Matters | Interview Frequency |
|---|---|---|---|
| 1 | SELECT, WHERE, ORDER BY | Foundation of all queries | Always |
| 2 | GROUP BY + Aggregate Functions | Summarizing data | Always |
| 3 | JOINs (INNER, LEFT, RIGHT, FULL) | Combining data from multiple tables | Always |
| 4 | HAVING clause | Filtering aggregated results | Very High |
| 5 | Subqueries | Nested logic for complex analyses | High |
| 6 | Window Functions | Ranking, running totals, period-over-period | Very High |
| 7 | CTEs (WITH clause) | Readable, reusable query logic | High |
| 8 | Date Functions | Time-based analysis is everywhere | High |
| 9 | CASE WHEN | Conditional logic and categorization | High |
| 10 | String Functions | Text data cleaning and parsing | Medium |
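CASE WHEN and string functions (concepts 9 and 10) are easiest to grasp from a small example. The sketch below runs against an in-memory SQLite database using Python's built-in sqlite3 module. The table contents and the High/Medium/Low thresholds are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderID INTEGER, CustomerName TEXT, Amount INTEGER);
    -- Hypothetical, deliberately messy names
    INSERT INTO Orders VALUES
        (101, '  ravi kumar ', 5000),
        (102, 'PRIYA SHARMA', 3200),
        (103, 'arun raj', 1800);
""")

rows = conn.execute("""
    SELECT
        OrderID,
        UPPER(TRIM(CustomerName)) AS clean_name,   -- string cleaning
        CASE                                        -- conditional bucketing
            WHEN Amount >= 5000 THEN 'High'
            WHEN Amount >= 2000 THEN 'Medium'
            ELSE 'Low'
        END AS order_size
    FROM Orders
    ORDER BY OrderID
""").fetchall()

for row in rows:
    print(row)
```

CASE WHEN evaluates its branches top to bottom and stops at the first match, so the order of the conditions matters.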
JOINs Explained with Real Examples
JOINs come up in almost every data analyst interview, so you cannot avoid them. Here are the two types you will use most, INNER JOIN and LEFT JOIN, with a real-world example.
The Setup: Orders and Customers Tables
-- Customers Table
CustomerID | CustomerName | City
1 | Ravi Kumar | Salem
2 | Priya Sharma | Chennai
3 | Arun Raj | Coimbatore
-- Orders Table
OrderID | CustomerID | Amount | Status
101 | 1 | 5000 | Completed
102 | 2 | 3200 | Pending
103 | 4 | 1800 | Completed -- CustomerID 4 doesn't exist
INNER JOIN — Only Matching Rows
SELECT c.CustomerName, o.OrderID, o.Amount
FROM Customers c
INNER JOIN Orders o ON c.CustomerID = o.CustomerID;
-- Result: Ravi Kumar (₹5000), Priya Sharma (₹3200)
-- Arun Raj excluded (no orders), Order 103 excluded (no matching customer)
LEFT JOIN — All Customers, Even Those Without Orders
SELECT c.CustomerName, o.OrderID, o.Amount
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID;
-- Result: Ravi, Priya, AND Arun (with NULL for order fields)
-- Use this when you want to find customers who have NOT ordered: WHERE o.OrderID IS NULL
A Classic Interview Question
-- "Find all customers who have never placed an order"
SELECT c.CustomerName
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
WHERE o.OrderID IS NULL;
-- Result: Arun Raj (no matching order)
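To check these results yourself, the three queries above can be run against the sample Customers and Orders rows. This is a minimal sketch that assumes an in-memory SQLite database through Python's built-in sqlite3 module; the column types are assumptions, since the article only shows the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER, CustomerName TEXT, City TEXT);
    CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER, Amount INTEGER, Status TEXT);
    INSERT INTO Customers VALUES
        (1, 'Ravi Kumar', 'Salem'),
        (2, 'Priya Sharma', 'Chennai'),
        (3, 'Arun Raj', 'Coimbatore');
    INSERT INTO Orders VALUES
        (101, 1, 5000, 'Completed'),
        (102, 2, 3200, 'Pending'),
        (103, 4, 1800, 'Completed');  -- CustomerID 4 has no customer row
""")

# INNER JOIN: only rows that match on both sides
inner_rows = conn.execute("""
    SELECT c.CustomerName, o.OrderID, o.Amount
    FROM Customers c
    INNER JOIN Orders o ON c.CustomerID = o.CustomerID
    ORDER BY o.OrderID
""").fetchall()

# LEFT JOIN: every customer, with NULL (None in Python) when there is no order
left_rows = conn.execute("""
    SELECT c.CustomerName, o.OrderID, o.Amount
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
    ORDER BY c.CustomerID
""").fetchall()

# Anti-join: customers with no matching order
never_ordered = conn.execute("""
    SELECT c.CustomerName
    FROM Customers c
    LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
    WHERE o.OrderID IS NULL
""").fetchall()

print(inner_rows)
print(left_rows)
print(never_ordered)
```

Order 103 never appears in any of the three results because each query starts from Customers. Starting from Orders with a LEFT JOIN would keep it instead.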
Window Functions: The Interview Difference-Maker
Window functions are the most tested advanced SQL topic in Indian data analyst interviews. Most candidates can do basic SELECT and GROUP BY. Fewer can write window functions. That gap is your opportunity.
The Core Window Functions
-- ROW_NUMBER: Assign unique rank to each row within a partition
SELECT
ProductName,
Region,
Revenue,
ROW_NUMBER() OVER (PARTITION BY Region ORDER BY Revenue DESC) AS rank_in_region
FROM Sales;
-- RANK: Like ROW_NUMBER, but tied rows share a rank and the next rank is skipped (1, 1, 3)
SELECT
SalesPersonName,
TotalSales,
RANK() OVER (ORDER BY TotalSales DESC) AS sales_rank
FROM SalesSummary;
-- LAG/LEAD: Compare current row to previous or next row
SELECT
Month,
Revenue,
LAG(Revenue, 1) OVER (ORDER BY Month) AS prev_month_revenue,
Revenue - LAG(Revenue, 1) OVER (ORDER BY Month) AS month_over_month_change
FROM MonthlySales;
-- Running Total with SUM OVER
SELECT
TransactionDate,
Amount,
SUM(Amount) OVER (ORDER BY TransactionDate) AS cumulative_total
FROM Transactions;
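The LAG and running-total patterns above can be tried on a tiny table. This sketch assumes SQLite 3.25 or newer (the first version with window functions), accessed through Python's sqlite3 module. The three months of revenue are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MonthlySales (Month TEXT, Revenue INTEGER);
    -- Hypothetical figures for illustration
    INSERT INTO MonthlySales VALUES
        ('2026-01', 100),
        ('2026-02', 130),
        ('2026-03', 120);
""")

rows = conn.execute("""
    SELECT
        Month,
        Revenue,
        LAG(Revenue, 1) OVER (ORDER BY Month) AS prev_month_revenue,
        Revenue - LAG(Revenue, 1) OVER (ORDER BY Month) AS month_over_month_change,
        SUM(Revenue) OVER (ORDER BY Month) AS cumulative_total
    FROM MonthlySales
    ORDER BY Month
""").fetchall()

for row in rows:
    print(row)
```

The first month has no previous row, so LAG returns NULL there and the month-over-month change is NULL as well. Interviewers often ask how you would handle that case, for example with COALESCE.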
Classic Window Function Interview Question
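-- "Find the top 3 products by revenue in each category"
-- Note: RANK keeps ties, so a category can return more than 3 rows; use ROW_NUMBER if you need exactly 3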
WITH ranked_products AS (
SELECT
Category,
ProductName,
Revenue,
RANK() OVER (PARTITION BY Category ORDER BY Revenue DESC) AS rnk
FROM Products
)
SELECT Category, ProductName, Revenue
FROM ranked_products
WHERE rnk <= 3;
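How ties are handled decides how many rows a top-N query like this returns. The sketch below compares ROW_NUMBER, RANK, and DENSE_RANK on made-up product data with a tie for third place. It assumes SQLite 3.25+ via Python's sqlite3 module, and the product names and revenues are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (Category TEXT, ProductName TEXT, Revenue INTEGER);
    -- Headset and Soundbar tie on revenue
    INSERT INTO Products VALUES
        ('Audio', 'Earbuds', 900),
        ('Audio', 'Speaker', 700),
        ('Audio', 'Headset', 500),
        ('Audio', 'Soundbar', 500),
        ('Audio', 'Mic', 200);
""")

ranked = conn.execute("""
    SELECT
        ProductName,
        Revenue,
        ROW_NUMBER() OVER (PARTITION BY Category ORDER BY Revenue DESC, ProductName) AS row_num,
        RANK()       OVER (PARTITION BY Category ORDER BY Revenue DESC) AS rnk,
        DENSE_RANK() OVER (PARTITION BY Category ORDER BY Revenue DESC) AS dense_rnk
    FROM Products
    ORDER BY Revenue DESC, ProductName
""").fetchall()

# Filtering on each ranking column gives a different "top 3"
top3_by_rank = [name for name, _, _, rnk, _ in ranked if rnk <= 3]
top3_by_row_number = [name for name, _, row_num, _, _ in ranked if row_num <= 3]

print(ranked)
print(top3_by_rank)
print(top3_by_row_number)
```

With RANK, both tied products are kept, so the "top 3" has four rows. With ROW_NUMBER, exactly three rows come back, and the extra ORDER BY column (ProductName here) decides which tied product wins.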
CTEs and Subqueries for Complex Analysis
CTEs (Common Table Expressions) make complex SQL readable. Instead of nested subqueries that are hard to debug, CTEs let you break the logic into named, reusable steps.
-- Business question: "Which customers spent more than the average order value
-- and placed more than 3 orders in the last 6 months?"
-- Old way (nested subqueries, hard to read):
-- DATEADD and GETDATE are SQL Server functions; other databases use their own date syntax
SELECT DISTINCT CustomerID
FROM Orders
WHERE CustomerID IN (
SELECT CustomerID FROM Orders
WHERE OrderDate >= DATEADD(month, -6, GETDATE())
GROUP BY CustomerID
HAVING COUNT(*) > 3
AND AVG(Amount) > (SELECT AVG(Amount) FROM Orders)
);
-- Better way with CTEs:
WITH recent_orders AS (
SELECT CustomerID, Amount
FROM Orders
WHERE OrderDate >= DATEADD(month, -6, GETDATE())
),
customer_summary AS (
SELECT
CustomerID,
COUNT(*) AS order_count,
AVG(Amount) AS avg_order_value
FROM recent_orders
GROUP BY CustomerID
),
overall_avg AS (
SELECT AVG(Amount) AS global_avg FROM Orders
)
SELECT cs.CustomerID
FROM customer_summary cs
CROSS JOIN overall_avg oa
WHERE cs.order_count > 3
AND cs.avg_order_value > oa.global_avg;
The CTE version is dramatically more readable and easier to debug. Senior analysts write CTEs, not nested subqueries.
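One practical benefit of CTEs is that each named step can be inspected on its own. The sketch below ports the CTE query to SQLite through Python's sqlite3 module, swapping the SQL Server date functions for SQLite's date('now', '-6 months'). The order rows are invented test data with dates generated relative to today.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER, Amount INTEGER, OrderDate TEXT);
    -- Customer 1: four recent high-value orders
    INSERT INTO Orders VALUES
        (1, 1, 900, date('now', '-10 days')),
        (2, 1, 800, date('now', '-20 days')),
        (3, 1, 700, date('now', '-30 days')),
        (4, 1, 600, date('now', '-40 days'));
    -- Customer 2: four recent low-value orders
    INSERT INTO Orders VALUES
        (5, 2, 100, date('now', '-10 days')),
        (6, 2, 100, date('now', '-20 days')),
        (7, 2, 100, date('now', '-30 days')),
        (8, 2, 100, date('now', '-40 days'));
    -- Customer 3: high-value orders, but only two in the last 6 months
    INSERT INTO Orders VALUES
        (9, 3, 1000, date('now', '-10 days')),
        (10, 3, 1000, date('now', '-20 days')),
        (11, 3, 1000, date('now', '-400 days')),
        (12, 3, 1000, date('now', '-410 days'));
""")

# The shared CTE definitions, reused by both queries below
ctes = """
    WITH recent_orders AS (
        SELECT CustomerID, Amount
        FROM Orders
        WHERE OrderDate >= date('now', '-6 months')
    ),
    customer_summary AS (
        SELECT CustomerID, COUNT(*) AS order_count, AVG(Amount) AS avg_order_value
        FROM recent_orders
        GROUP BY CustomerID
    ),
    overall_avg AS (
        SELECT AVG(Amount) AS global_avg FROM Orders
    )
"""

# Debugging step: look at one intermediate CTE by itself
summary = conn.execute(
    ctes + "SELECT * FROM customer_summary ORDER BY CustomerID"
).fetchall()

# Final answer: more than 3 recent orders and above-average order value
result = conn.execute(ctes + """
    SELECT cs.CustomerID
    FROM customer_summary cs
    CROSS JOIN overall_avg oa
    WHERE cs.order_count > 3
      AND cs.avg_order_value > oa.global_avg
""").fetchall()

print(summary)
print(result)
```

Only customer 1 qualifies: customer 2 has enough orders but a low average, and customer 3 has a high average but too few recent orders.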
SQL + Power BI + Python: How They Work Together
SQL with Power BI
In Power BI, you can use SQL to pull data from your database. The workflow:
- Connect Power BI to your SQL Server, MySQL, or PostgreSQL database
- Write a native SQL query to pull exactly the data you need
- Or use DirectQuery mode for live connections to large databases
- Power BI measures are written in DAX, a separate formula language; SQL concepts such as filtering and aggregation still carry over
-- Example: Native SQL query in Power BI data source
-- (MONTH() and YEAR() work in SQL Server and MySQL; PostgreSQL uses EXTRACT instead)
SELECT
p.ProductName,
c.CategoryName,
SUM(oi.Quantity * oi.UnitPrice) AS TotalRevenue,
MONTH(o.OrderDate) AS Month,
YEAR(o.OrderDate) AS Year
FROM Orders o
JOIN OrderItems oi ON o.OrderID = oi.OrderID
JOIN Products p ON oi.ProductID = p.ProductID
JOIN Categories c ON p.CategoryID = c.CategoryID
WHERE o.OrderDate >= '2025-01-01'
GROUP BY p.ProductName, c.CategoryName, MONTH(o.OrderDate), YEAR(o.OrderDate)
ORDER BY Year, Month, TotalRevenue DESC;
SQL with Python (Pandas)
import pandas as pd
import sqlalchemy
# Connect to database (placeholder credentials; a PostgreSQL driver such as psycopg2 must be installed)
engine = sqlalchemy.create_engine('postgresql://username:password@host:5432/dbname')
# Read SQL directly into Pandas DataFrame
query = """
SELECT customer_id, SUM(amount) as total_spend
FROM orders
WHERE status = 'completed'
GROUP BY customer_id
ORDER BY total_spend DESC
LIMIT 100
"""
df = pd.read_sql(query, engine)
# Continue analysis in Python
print(df.head(10))
print(f"Average customer spend: ₹{df['total_spend'].mean():,.0f}")
15 Real SQL Interview Questions for Indian Companies
These questions are from actual data analyst interviews at Indian companies and IT firms.
- "Write a query to find the second highest salary in the employees table." (Window functions or subquery)
- "Find all customers who ordered in January 2026 but not in February 2026." (NOT IN / EXCEPT)
- "Calculate the month-over-month revenue growth percentage." (LAG window function)
- "Find the top 3 products by sales in each region." (RANK with PARTITION BY)
- "Find departments where the average salary is higher than the company average." (HAVING + subquery)
- "Write a query to detect duplicate records in a table." (GROUP BY + HAVING COUNT > 1)
- "Find customers who have placed orders every month for the past 6 months." (COUNT DISTINCT on months)
- "What is the difference between DELETE, DROP, and TRUNCATE?" (Conceptual)
- "Write a self-join to find employees who earn more than their manager." (Self JOIN)
- "Find the running total of sales day by day for this quarter." (SUM OVER ORDER BY)
- "Explain the difference between INNER JOIN and LEFT JOIN with an example." (Conceptual + code)
- "Write a query to pivot monthly sales data into columns for each month." (CASE WHEN pivot technique)
- "Find all orders where the delivery date was more than 7 days after the order date." (Date functions)
- "Calculate the 7-day moving average of page views." (Window function with frame clause)
- "Write a CTE that identifies customers at risk of churn (no orders in the last 90 days)." (CTE + date functions)
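Three of the questions above (second highest salary, employees who earn more than their manager, and duplicate detection) can be practiced on one small table. This is a sketch under stated assumptions: the Employees table, its columns, and its rows are hypothetical, and it runs on SQLite 3.25+ through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmpID INTEGER, Name TEXT, Salary INTEGER, ManagerID INTEGER);
    -- Hypothetical data; row 5 is an accidental re-entry of row 3
    INSERT INTO Employees VALUES
        (1, 'Asha', 90000, NULL),
        (2, 'Bala', 95000, 1),
        (3, 'Chitra', 70000, 1),
        (4, 'Dinesh', 95000, 2),
        (5, 'Chitra', 70000, 1);
""")

# Second highest salary, window-function version.
# DENSE_RANK treats the two 95000 salaries as one rank.
second_highest_window = conn.execute("""
    SELECT DISTINCT Salary
    FROM (
        SELECT Salary, DENSE_RANK() OVER (ORDER BY Salary DESC) AS rnk
        FROM Employees
    ) AS ranked
    WHERE rnk = 2
""").fetchone()[0]

# Second highest salary, subquery version
second_highest_subquery = conn.execute("""
    SELECT MAX(Salary)
    FROM Employees
    WHERE Salary < (SELECT MAX(Salary) FROM Employees)
""").fetchone()[0]

# Self-join: employees who earn more than their manager
earns_more = conn.execute("""
    SELECT e.Name, m.Name
    FROM Employees e
    JOIN Employees m ON e.ManagerID = m.EmpID
    WHERE e.Salary > m.Salary
""").fetchall()

# Duplicate detection on the business columns (everything except EmpID)
duplicates = conn.execute("""
    SELECT Name, COUNT(*) AS copies
    FROM Employees
    GROUP BY Name, Salary, ManagerID
    HAVING COUNT(*) > 1
""").fetchall()

print(second_highest_window, second_highest_subquery)
print(earns_more)
print(duplicates)
```

Dinesh earns the same as his manager Bala, so the strict greater-than comparison leaves him out. Clarifying edge cases like ties with the interviewer before writing the query is usually a good habit.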
7 SQL Mistakes Beginners Make
- Using SELECT * everywhere. Specify the columns you need. SELECT * reads and transfers every column, which slows queries on wide or large tables and makes the query harder to understand.
- Forgetting NULL handling. NULL is not zero or an empty string. Use IS NULL and IS NOT NULL; a comparison such as = NULL evaluates to unknown, so it never matches a row.
- Confusing WHERE and HAVING. WHERE filters before aggregation; HAVING filters after. Mixing them up gives wrong results.
- Not understanding JOIN types. Using INNER JOIN when you needed LEFT JOIN loses rows silently — causing reports to show incorrect totals.
- Writing subqueries instead of CTEs. Three levels of nested subqueries become impossible to maintain. Learn CTEs early.
- Ignoring indexes. Understanding why some queries are fast and others are slow requires knowing how indexes work.
- Not practicing on real datasets. Textbook examples are not enough. Practice on Kaggle datasets or real business data.
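Two of these mistakes, NULL comparisons and silently lost rows from the wrong JOIN type, are easy to reproduce. The sketch below uses SQLite via Python's sqlite3 module. The CouponCode column is a hypothetical addition to the sample Orders table, included only to have a nullable column to test against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER, CustomerName TEXT);
    CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER, Amount INTEGER, CouponCode TEXT);
    INSERT INTO Customers VALUES (1, 'Ravi Kumar'), (2, 'Priya Sharma'), (3, 'Arun Raj');
    INSERT INTO Orders VALUES
        (101, 1, 5000, 'NEW10'),
        (102, 2, 3200, NULL),
        (103, 4, 1800, NULL);  -- CustomerID 4 has no customer row
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# Mistake: comparing with = NULL matches nothing
eq_null = scalar("SELECT COUNT(*) FROM Orders WHERE CouponCode = NULL")
# Correct: IS NULL finds the two orders without a coupon
is_null = scalar("SELECT COUNT(*) FROM Orders WHERE CouponCode IS NULL")
# Related trap: <> also skips NULL rows
not_new10 = scalar("SELECT COUNT(*) FROM Orders WHERE CouponCode <> 'NEW10'")

# Mistake: INNER JOIN drops order 103, so the revenue total is too low
total_inner = scalar("""
    SELECT SUM(o.Amount)
    FROM Orders o
    INNER JOIN Customers c ON o.CustomerID = c.CustomerID
""")
# LEFT JOIN from Orders keeps every order
total_left = scalar("""
    SELECT SUM(o.Amount)
    FROM Orders o
    LEFT JOIN Customers c ON o.CustomerID = c.CustomerID
""")

print(eq_null, is_null, not_new10)
print(total_inner, total_left)
```

Neither mistake raises an error. The queries run and return plausible numbers, which is why comparing row counts and totals before and after a JOIN is a useful sanity check.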
SQL Learning Path: 8-Week Plan
| Week | Topics | Practice Goal |
|---|---|---|
| 1 | SELECT, WHERE, ORDER BY, LIMIT | 50 basic queries |
| 2 | GROUP BY, Aggregate Functions, HAVING | 30 aggregation problems |
| 3 | INNER JOIN, LEFT JOIN | 20 join problems |
| 4 | RIGHT JOIN, FULL JOIN, Self JOIN | 15 advanced join problems |
| 5 | Subqueries, Correlated Subqueries | 20 nested query problems |
| 6 | CTEs, CASE WHEN, String Functions | 20 complex logic problems |
| 7 | Window Functions — ROW_NUMBER, RANK, LAG/LEAD, SUM OVER | 25 window function problems |
| 8 | Date Functions, Query Optimization, Mock Interview | Full mock interview + 2 projects |
Resources to practice:
- Kaggle — free datasets and SQL notebooks
- SQLZoo — interactive SQL practice
- LeetCode SQL problems — 50 essential questions
- HackerRank SQL challenges — beginner to advanced
Frequently Asked Questions
Is SQL still important for data analysts in 2026?
Yes. SQL appears in 85%+ of data analyst job descriptions in India. It is the most demanded technical skill.
Which SQL should I learn — MySQL, PostgreSQL, or SQL Server?
Learn ANSI SQL fundamentals first, since most core syntax (SELECT, JOIN, GROUP BY, window functions) is the same across all three. Then specialize based on your target industry.
How long does it take to learn SQL?
4–8 weeks for fundamentals. An additional 4–8 weeks for advanced topics like window functions and query optimization.
What SQL topics do interviewers test?
JOINs, GROUP BY, window functions (especially ROW_NUMBER/LAG/LEAD), CTEs, subqueries, and date functions.
Can I use SQL with Power BI?
Yes. Power BI connects directly to SQL databases. You can write native SQL queries in Power BI's import step.
Where can I learn SQL in Salem?
Linkskill Academy offers a structured SQL course in Salem with hands-on projects and interview preparation.
What is the difference between WHERE and HAVING?
WHERE filters rows before aggregation. HAVING filters groups after aggregation. For example, WHERE revenue > 1000 keeps individual rows above 1000, while HAVING SUM(revenue) > 10000 keeps only groups whose total exceeds 10000.
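The difference is clearer when both filters run on the same data. This is a minimal sketch on a made-up Sales table, using SQLite through Python's sqlite3 module. The regions and revenue figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (Region TEXT, Revenue INTEGER);
    INSERT INTO Sales VALUES
        ('North', 6000), ('North', 5000), ('North', 500),
        ('South', 4000), ('South', 800),
        ('East', 900), ('East', 700);
""")

# WHERE: filters individual rows, no aggregation involved
where_rows = conn.execute("""
    SELECT Region, Revenue
    FROM Sales
    WHERE Revenue > 1000
    ORDER BY Region, Revenue
""").fetchall()

# HAVING: filters whole groups by their aggregated total
having_rows = conn.execute("""
    SELECT Region, SUM(Revenue) AS total
    FROM Sales
    GROUP BY Region
    HAVING SUM(Revenue) > 10000
    ORDER BY Region
""").fetchall()

# Both together: WHERE removes small rows first, then HAVING checks the reduced total
both_rows = conn.execute("""
    SELECT Region, SUM(Revenue) AS total
    FROM Sales
    WHERE Revenue > 1000
    GROUP BY Region
    HAVING SUM(Revenue) > 10000
    ORDER BY Region
""").fetchall()

print(where_rows)
print(having_rows)
print(both_rows)
```

Note how North's total drops from 11500 to 11000 in the combined query, because the 500 row was removed by WHERE before the SUM was computed.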
Do I need SQL to use Python for data analysis?
Not strictly, but SQL knowledge helps enormously — most data sources used with Python come from SQL databases.
Learn SQL the Right Way at Linkskill Academy, Salem
Our structured SQL Course in Salem takes you from zero to interview-ready in 8 weeks, with real datasets, hands-on practice, and mock interview sessions.