Constraints Do Matter!

himanshu
4 min readJan 7, 2019

--

Recently I came across an interesting problem in an online coding round. This question is a perfect example of, that if we look at the constraints carefully we can come up with simple and efficient solutions more quickly in online assessments. So let’s have a look at the problem and how to approach it correctly.

Problem Statement

You are given ‘N’ words in a dictionary such that each word has a length ‘K’ and consists of only lowercase alphabets(‘a’, ‘b’, ….., ‘z’). You will be given ‘Q’ queries where each query consists of a word of length ‘K’ and with some unspecified letters represented by ‘?’ .
we have to write a program that counts the number of words in the dictionary which has the same letters in the specified positions.

Input Format

First Line: Two space separated integers ‘N’ and ‘K’.
Next ‘N’ lines: each contains a string
Next Line: An integer Q
Next ‘Q’ lines: each contains query String.

Now Let’s have a look at a example.

5 3
man
can
ran
pin
sin
2
?a?
si?

Output Format:
3
1

Constraints:
N ≤ 5*10⁴
K ≤ 7
Q ≤ 10⁵

Let’s understand this example. The first query we are given is ?a? which means the first and the third letter can be replaced with any alphabet. Therefore the words that match this particular criteria are:
man, can & ran constituting a total of ‘3’ . Similarly in the second query the only dictionary word that matches the criteria is sin and hence the answer is ‘1’.

Solution
Now we know the problem. Let’s dive straight into the solution. First let’s talk about the brute force approach in which we try to match every dictionary word with the query and increment the count.Let’s talk about the space and the Time Complexity of this solution.

Although the above solution is easy to come-up with it will straight ahead give a TLE(Time Limit Error). Let’s see why!
For each query we are checking all the words in the dictionary each of length K. Therefore the total time complexity for all the queries is O(Q*N*K) which is pretty slow.

Can we do something better. Let’s take a look at the constraints again. The maximum length of the word can be 7.
Voila!!! There’s the trick we can generate all the possible queries that can be made with the given dictionary words and hash their counts. Let’s see why this solution works and is faster compared to previous one.
Suppose we have a dictionary word “pens”. The only way a particular query matches this word is if it has ‘?’ on some or all of the places

Query generation.

The total queries that can be made with this word is 2*(length of the word) because at every position we have two choices either we have that particular letter or it is replaced with question mark. Therefore the maximum queries that can be made in worst case is 2⁷ =128. Also we have a maximum of 5*10⁴ words therefore the total query words that can be generated is (128*5*10⁴) in worst case.

Hmm. I was coming to it by the way. Let’s see the query step. Now we have count of every possible query that matches any dictionary word already. We can simply answer the query in a constant time. Let’s take a look at the code for better understanding.

In the code above the generateMatchableQueries() function is used to generate all the queries that are able to match this word. We do the same thing for every word in the dictionary. Once we have generated all the queries that could match, along with their count. we can simply answer any query in O(1) time.
Let’s analyse the time complexity of the algorithm. The generation step takes O(128*N) = O(N) in worst case. The query step takes O(1) time for a single query. Therefore the total complexity is O(N) + O(Q). Thus we reduced the overall complexity from a significant factor.
You must be wondering that why the constraints matter in this algorithm. It’s because if the max length would have been more than 7 somewhere ≥ 10. this algorithm would not have been worked either. Then we might have to use tries.

Wrapping It Up
Whenever a problem is given in an online coding assessment always try to take advantage of the constraints because it doesn’t make any sense to write a Quicksort when the constraints are as low as 100. A BubbleSort would be as good and easy to code.

Sign up to discover human stories that deepen your understanding of the world.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

--

--

himanshu
himanshu

Written by himanshu

Software developer and tech enthusiast.

No responses yet

Write a response