SQL
Lab
LinkedIn Easy Group By Duplicates

Duplicate
Job
Listings

Count how many companies have posted duplicate job listings, where duplicate listings share the same company, title, and description.

Table Schema

Inspect
Table

interactive
ColumnTypeDescription
job_idintegerUnique job posting identifier.
company_idintegerIdentifier for the company posting the job.
titlestringJob listing title.
descriptionstringJob listing description.
Duplicate CheckReason
company_idDuplicates must be within the same company.
titleThe title must match exactly.
descriptionThe description must match exactly.

Sample Data

Input
Output

Sample Input: job_listings
job_idcompany_idtitledescription
248827Business AnalystBusiness analyst evaluates past and current business data with the primary goal of improving decision-making processes within organizations.
149845Business AnalystBusiness analyst evaluates past and current business data with the primary goal of improving decision-making processes within organizations.
945345Data AnalystData analyst reviews data to identify key insights into a business's customers and ways the data can be used to solve problems.
164345Data AnalystData analyst reviews data to identify key insights into a business's customers and ways the data can be used to solve problems.
172244Data EngineerData engineer works in a variety of settings to build systems that collect, manage, and convert raw data into usable information.
Expected Output
duplicate_companies
1

SQL Editor

Run
Query

postgresql
Waiting for query

duplicate_companies
1

Hints

Unlock
Clues

Hint 01: Duplicates are not based on job_id. Group by company_id, title, and description.
Hint 02: A duplicate listing group has more than one row, so use HAVING COUNT(*) > 1.
Hint 03: Count distinct company_id values from the duplicate groups so each company is counted once.

Solution

Locked
Answer

Solution is locked until you decide to reveal it. Try the editor first, then open this when you want the reference answer.

SELECT
  COUNT(DISTINCT company_id) AS duplicate_companies
FROM (
  SELECT
    company_id,
    title,
    description
  FROM job_listings
  GROUP BY company_id, title, description
  HAVING COUNT(*) > 1
) AS duplicate_jobs;

Explanation

Step By
Step

01

Group rows by company_id, title, and description because all three fields define a duplicate job listing group.

02

Use HAVING COUNT(*) > 1 to keep only groups where the same company posted the same title and description multiple times.

03

Count distinct company_id values from those duplicate groups and alias the result as duplicate_companies.

Related Questions

Keep
Solving