1. In what situations would you choose to use CASE WHEN over other conditional constructs like IF or COALESCE?
Answer: CASE WHEN is useful when you need to evaluate multiple conditions or categorize data. It is usually cleaner and more readable than nested IF calls, and it can express NULL-handling logic that goes beyond the simple first-non-NULL substitution that COALESCE provides.
2. What is the significance of the ELSE clause in a CASE WHEN statement?
Answer: If none of the preceding conditions is true, the ELSE clause provides a default result. If ELSE is omitted and no condition matches, the expression returns NULL.
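The effect of ELSE can be seen in a minimal sketch. It runs the SQL through Python's built-in sqlite3 module so that it is self-contained; the orders table, its columns, and the 100 threshold are hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 500), (2, 50)])

# Same condition twice: once with an ELSE default, once without.
else_rows = conn.execute(
    """
    SELECT id,
           CASE WHEN amount >= 100 THEN 'large' ELSE 'small' END AS with_else,
           CASE WHEN amount >= 100 THEN 'large' END AS without_else
    FROM orders
    ORDER BY id
    """
).fetchall()

print(else_rows)  # [(1, 'large', 'large'), (2, 'small', None)]
```

The second row shows the difference: with ELSE the unmatched row gets 'small', and without it the result is NULL (None in Python).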
3. Provide a real-world scenario where using multiple CASE WHEN statements would be beneficial.
Answer: In retail, we can use multiple CASE WHEN statements in one query to categorize products by rating, sales, and profit, with each statement producing its own category column.
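As a rough sketch of that retail scenario, the query below uses three separate CASE WHEN expressions, one per category column. The products table, its sample rows, and every cutoff value are assumptions made for illustration, and the SQL is run through Python's sqlite3 module only to keep the example self-contained.

```python
import sqlite3

# Hypothetical products table and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (name TEXT, rating REAL, sales INTEGER, profit INTEGER)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [("Kettle", 4.6, 1200, 300), ("Toaster", 3.1, 150, -20)],
)

# One CASE WHEN expression per category; all cutoffs are illustrative.
product_rows = conn.execute(
    """
    SELECT name,
           CASE WHEN rating >= 4.5 THEN 'top rated'
                WHEN rating >= 3.5 THEN 'well rated'
                ELSE 'poorly rated'
           END AS rating_band,
           CASE WHEN sales >= 1000 THEN 'high volume'
                ELSE 'low volume'
           END AS sales_band,
           CASE WHEN profit > 0 THEN 'profitable'
                ELSE 'loss-making'
           END AS profit_band
    FROM products
    ORDER BY name
    """
).fetchall()

for row in product_rows:
    print(row)
```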
4. Explain the concept of nesting in SQL. How and when would you use nested CASE WHEN statements?
Answer: Nesting involves placing one CASE WHEN statement inside another. This can be used when conditions depend on the outcome of prior conditions, creating a hierarchy of logic.
5. Provide an example where nesting CASE WHEN statements is necessary for a more complex condition.
Answer: In a grading system, you might nest CASE WHEN statements to categorize students as ‘Excellent,’ ‘Good,’ ‘Satisfactory,’ or ‘Needs Improvement’ based on both grade and participation.
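One possible shape for that grading logic is sketched below: the outer CASE WHEN branches on grade and each inner CASE WHEN branches on participation. The students table, the sample rows, and the 85/60/80 cutoffs are hypothetical, and Python's sqlite3 module is used only to make the sketch runnable.

```python
import sqlite3

# Hypothetical students table; grade and participation are percentages.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (name TEXT, grade INTEGER, participation INTEGER)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Ana", 90, 90), ("Ben", 90, 50), ("Cy", 70, 40), ("Di", 40, 95)],
)

# Outer CASE decides on grade; the inner CASE refines by participation.
grade_rows = conn.execute(
    """
    SELECT name,
           CASE
               WHEN grade >= 85 THEN
                   CASE WHEN participation >= 80 THEN 'Excellent'
                        ELSE 'Good' END
               WHEN grade >= 60 THEN
                   CASE WHEN participation >= 80 THEN 'Good'
                        ELSE 'Satisfactory' END
               ELSE 'Needs Improvement'
           END AS category
    FROM students
    ORDER BY name
    """
).fetchall()

print(grade_rows)
```

The same result could be written as one flat CASE WHEN with combined AND conditions; nesting mainly helps when the inner rules differ from branch to branch.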
6. How does the CASE WHEN statement handle NULL values in conditions?
Answer: A comparison involving NULL evaluates to unknown rather than true, so the WHEN branch is skipped and evaluation falls through to the next branch or to ELSE. To handle NULL values explicitly, test with IS NULL or wrap the column in COALESCE.
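The behavior can be checked with a small sketch. The reviews table and its values are hypothetical, and the SQL is run through Python's sqlite3 module to keep the example self-contained.

```python
import sqlite3

# Hypothetical reviews table; the second row has a missing rating.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (id INTEGER, rating INTEGER)")
conn.executemany("INSERT INTO reviews VALUES (?, ?)", [(1, 5), (2, None)])

null_rows = conn.execute(
    """
    SELECT id,
           -- NULL >= 4 is not true, so the NULL row falls through to ELSE.
           CASE WHEN rating >= 4 THEN 'high' ELSE 'not high' END AS naive,
           -- "= NULL" never matches, not even for the NULL row.
           CASE WHEN rating = NULL THEN 'missing' ELSE 'present' END AS equals_null,
           -- IS NULL is the explicit test that does match.
           CASE WHEN rating IS NULL THEN 'missing'
                WHEN rating >= 4 THEN 'high'
                ELSE 'not high'
           END AS explicit
    FROM reviews
    ORDER BY id
    """
).fetchall()

print(null_rows)
# [(1, 'high', 'present', 'high'), (2, 'not high', 'present', 'missing')]
```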
7. Discuss potential performance considerations when using multiple CASE WHEN statements.
Answer: The conditions are evaluated for every row, so long chains of conditions may slow queries over large tables. Indexing columns that the conditions share with filters or joins, and simplifying the logic, can improve performance.
8. How would you optimize a query involving multiple nested CASE WHEN statements for better performance?
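Answer: Review the query and its execution plan regularly, ensure proper indexing, and simplify complex logic, for example by flattening nested branches into a single CASE WHEN with combined conditions where that is possible.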
9. Imagine a scenario where the classification criteria for products based on sales need to be adjusted dynamically. How would you implement this using CASE WHEN?
Answer: By replacing hard-coded thresholds in the CASE WHEN conditions with variables or query parameters, so the criteria can be adjusted as business requirements change without rewriting the query.
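A minimal sketch of that idea uses named query parameters as the thresholds. The classify_by_sales helper, the products table, and the cutoff values are all hypothetical; how parameters are bound depends on the database and client library, and Python's sqlite3 module is used here only because it is self-contained.

```python
import sqlite3

# Hypothetical products table with a sales figure per product.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("A", 900), ("B", 400), ("C", 50)],
)


def classify_by_sales(connection, high_cutoff, mid_cutoff):
    """Classify products using thresholds supplied at call time."""
    return connection.execute(
        """
        SELECT name,
               CASE WHEN sales >= :high THEN 'high'
                    WHEN sales >= :mid THEN 'medium'
                    ELSE 'low'
               END AS sales_class
        FROM products
        ORDER BY name
        """,
        {"high": high_cutoff, "mid": mid_cutoff},
    ).fetchall()


# The same query text yields different classes as the criteria change.
current_classes = classify_by_sales(conn, 800, 300)
stricter_classes = classify_by_sales(conn, 1000, 500)
print(current_classes)
print(stricter_classes)
```

If the criteria change often, storing the thresholds in a small lookup table and joining to it is another option.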
10. Consider a situation where some data points are missing (NULL values). How would you handle this when using multiple CASE WHEN statements?
Answer: I would use the COALESCE function to handle NULL values and ensure that the conditions are explicitly defined for such scenarios.
11. Discuss potential pitfalls or challenges when working with complex conditions in a CASE WHEN statement.
Answer: Pitfalls include overcomplicating queries, overlooking specific conditions, and potentially impacting query readability. Careful consideration is needed to balance complexity and clarity.
12. Compare and contrast the CASE WHEN statement with the IF statement in SQL.
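Answer: CASE WHEN is SQL’s standard conditional construct and can be used as an expression inside any query, whereas IF is dialect-specific: for example, MySQL offers an IF() function and T-SQL uses IF for procedural control flow. CASE WHEN also provides a more readable and flexible solution for handling multiple conditions.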
13. In what scenarios would you prefer using a CASE WHEN statement over using a JOIN clause?
Answer: The two solve different problems: a JOIN clause combines rows from multiple tables, while CASE WHEN applies conditional logic to the values within each row. I would use CASE WHEN for categorization and JOIN for combining related data.
14. Write a query to identify customers who placed more than three transactions each in both 2019 and 2020.
Example:
Input:
transactions table
Column | Type |
---|---|
id | INTEGER |
user_id | INTEGER |
created_at | DATETIME |
product_id | INTEGER |
quantity | INTEGER |
users table
Column | Type |
---|---|
id | INTEGER |
name | VARCHAR |
Output:
Column | Type |
---|---|
customer_name | VARCHAR |
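One way to approach this is conditional aggregation: count each customer's transactions per year with a CASE WHEN inside SUM, then filter in HAVING. The sketch below runs against SQLite through Python's sqlite3 module, so the year is extracted with SQLite's strftime; other dialects typically use YEAR() or EXTRACT instead. The sample rows and the add_transactions helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE transactions (id INTEGER, user_id INTEGER, "
    "created_at TEXT, product_id INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?)", [(1, "Alice"), (2, "Bob"), (3, "Cara")]
)

sample_rows = []


def add_transactions(user_id, year, count):
    """Append `count` hypothetical transactions for one user in one year."""
    for day in range(1, count + 1):
        sample_rows.append(
            (len(sample_rows) + 1, user_id, f"{year}-03-{day:02d} 12:00:00", 1, 1)
        )


add_transactions(1, 2019, 4)  # Alice: 4 in 2019 and 4 in 2020, so she qualifies
add_transactions(1, 2020, 4)
add_transactions(2, 2019, 4)  # Bob: only 3 in 2020, which is not more than three
add_transactions(2, 2020, 3)
add_transactions(3, 2020, 5)  # Cara: nothing in 2019
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?)", sample_rows)

customer_rows = conn.execute(
    """
    SELECT u.name AS customer_name
    FROM users AS u
    JOIN transactions AS t ON t.user_id = u.id
    GROUP BY u.id, u.name
    HAVING SUM(CASE WHEN strftime('%Y', t.created_at) = '2019' THEN 1 ELSE 0 END) > 3
       AND SUM(CASE WHEN strftime('%Y', t.created_at) = '2020' THEN 1 ELSE 0 END) > 3
    ORDER BY u.name
    """
).fetchall()

print(customer_rows)  # [('Alice',)]
```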
15. Given a table exam_scores containing the data about all of the exams that students took, form a new table to track the scores for each student.
To finish a class, students must pass four exams (exam IDs 1, 2, 3, and 4).
Note: Students took each exam only once.
Example:
For the given input:
student_id | student_name | exam_id | score |
---|---|---|---|
100 | Anna | 1 | 71 |
100 | Anna | 2 | 72 |
100 | Anna | 3 | 73 |
100 | Anna | 4 | 74 |
101 | Brian | 1 | 65 |
the expected output should be:
student_name | exam_1 | exam_2 | exam_3 | exam_4 |
---|---|---|---|---|
Anna | 71 | 72 | 73 | 74 |
Brian | 65 | NULL | NULL | NULL |
Input:
exam_scores table
Column | Type |
---|---|
student_id | INTEGER |
student_name | VARCHAR |
exam_id | INTEGER |
score | INTEGER |
Output:
Column | Type |
---|---|
student_name | VARCHAR |
exam_1 | INT |
exam_2 | INT |
exam_3 | INT |
exam_4 | INT |
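A common way to build this kind of pivot is one CASE WHEN per exam, wrapped in an aggregate such as MAX and grouped by student. The sketch below is one possible solution, not the only one; it loads the example rows from the question into SQLite through Python's sqlite3 module and returns the pivot as a query result rather than creating a permanent table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exam_scores (student_id INTEGER, student_name TEXT, "
    "exam_id INTEGER, score INTEGER)"
)
# Example rows taken from the question.
conn.executemany(
    "INSERT INTO exam_scores VALUES (?, ?, ?, ?)",
    [
        (100, "Anna", 1, 71),
        (100, "Anna", 2, 72),
        (100, "Anna", 3, 73),
        (100, "Anna", 4, 74),
        (101, "Brian", 1, 65),
    ],
)

# Each CASE WHEN keeps the score for one exam and yields NULL otherwise;
# MAX then collapses the group to that single score (or NULL if not taken).
pivot_rows = conn.execute(
    """
    SELECT student_name,
           MAX(CASE WHEN exam_id = 1 THEN score END) AS exam_1,
           MAX(CASE WHEN exam_id = 2 THEN score END) AS exam_2,
           MAX(CASE WHEN exam_id = 3 THEN score END) AS exam_3,
           MAX(CASE WHEN exam_id = 4 THEN score END) AS exam_4
    FROM exam_scores
    GROUP BY student_id, student_name
    ORDER BY student_id
    """
).fetchall()

print(pivot_rows)
# [('Anna', 71, 72, 73, 74), ('Brian', 65, None, None, None)]
```

Because each student took each exam only once, MAX simply picks the single matching score; omitting ELSE is what produces the NULL entries for exams a student has not taken.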