I can see where you are going with sort: once the words are sorted, you can reliably tell when you have hit a new word and keep a running count for each unique word. However, what you really want is a hash (dictionary) to keep track of the counts, since dictionary keys are unique. For example:
words = sentence.split()
counts = {}
for word in words:
    if word not in counts:
        counts[word] = 0
    counts[word] += 1
That gives you a dictionary where each key is a word and each value is the number of times that word appears. You can simplify this with collections.defaultdict(int), so you can just add to the value:
import collections

counts = collections.defaultdict(int)
for word in words:
    counts[word] += 1
But there is something even better than that: collections.Counter, which takes your list of words and turns it into a dictionary (a subclass of dict, actually) containing the counts.
counts = collections.Counter(words)
From there you want the words in sorted order with their counts so you can print them. items() gives you the (word, count) pairs as tuples, and sorted will sort (by default) by the first item of each tuple (the word in this case), which is exactly what you want.
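To make the ordering concrete, here is a minimal sketch on a small made-up sentence (the sentence and the variable names are illustrative assumptions, not part of the original question). It also shows that string comparison is case-sensitive, so capitalized words sort before lowercase ones.

```python
from collections import Counter

# Hypothetical sample input, chosen only to illustrate the ordering.
word_counts = Counter("the cat saw The cat".split())

# items() yields (word, count) pairs; sorted() compares tuples element by
# element, so the pairs come out ordered by word first.
pairs = sorted(word_counts.items())
print(pairs)  # [('The', 1), ('cat', 2), ('saw', 1), ('the', 1)]
```

If a case-insensitive count is wanted instead, one option is to lowercase the sentence before splitting it.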
import collections

sentence = """As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"""
words = sentence.split()
word_counts = collections.Counter(words)
for word, count in sorted(word_counts.items()):
    print('"%s" is repeated %d time%s.' % (word, count, "s" if count > 1 else ""))
OUTPUT
"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.
Prerequisite: Dictionary data structure
Given a string, find the first repeated word in it. Examples:
Input : "Ravi had been saying that he had been there"
Output : had

Input : "Ravi had been saying that"
Output : No Repetition

Input : "he had had he"
Output : he
An existing solution to this problem is described in the article Find the first repeated word in a string. In Python, the problem can be solved quickly with the dictionary data structure. The approach is simple:
- First, split the given string on spaces.
- Next, convert the list of words into a dictionary using collections.Counter(iterable). The dictionary holds each word as a key and its frequency as the value.
- Finally, traverse the list of words again and report the first word whose frequency is greater than 1.
Python3
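from collections import Counter


def firstRepeat(text):
    # Split the string into words on single spaces
    words = text.split(' ')

    # Map each word to the number of times it occurs
    freq = Counter(words)

    # Walk the words in their original order and print the first one
    # that occurs more than once
    for key in words:
        if freq[key] > 1:
            print(key)
            return

    # No word occurs twice (see the second example above)
    print('No Repetition')


if __name__ == "__main__":
    text = 'Ravi had been saying that he had been there'
    firstRepeat(text)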
Output:
had
Time Complexity: O(n), where n is the number of words in the string
Auxiliary Space: O(k), where k is the number of distinct words stored in the dictionary
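For checking the approach against all three examples listed earlier, a variant that returns the word instead of printing it can be handy. The following is only a sketch: the function name first_repeated_word, the use of split() without an argument (which also tolerates repeated spaces), and the None return value for the no-repetition case are assumptions made here, not part of the original program.

```python
from collections import Counter


def first_repeated_word(text):
    """Return the first word of text that occurs more than once, else None."""
    words = text.split()
    freq = Counter(words)  # word -> number of occurrences
    for word in words:     # original order decides which word is "first"
        if freq[word] > 1:
            return word
    return None


print(first_repeated_word("Ravi had been saying that he had been there"))  # had
print(first_repeated_word("Ravi had been saying that"))                    # None
print(first_repeated_word("he had had he"))                                # he
```

Note that the third example yields "he" rather than "had": the scan reports the earliest word that has any repetition, not the word whose second occurrence comes first.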
Explanation
In this program, we need to find the duplicate words in a string and display them.
To find the duplicate words, we first split the string into words and then count the occurrences of each word. If a word's count is greater than 1, that word has duplicates in the string.
In the above example, the words highlighted in green are duplicate words.
Algorithm
- Define a string.
- Convert the string to lowercase so that the comparison is case-insensitive.
- Split the string into words.
- Use two loops to find duplicate words. The outer loop selects a word and initializes the variable count to 1. The inner loop compares the selected word with the rest of the words.
- If a match is found, increment count by 1 and set the duplicate occurrence of the word to '0' so that it is not counted again.
- After the inner loop, if count is greater than 1, the word has duplicates in the string.
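The steps above can be sketched in Python roughly as follows. This is an illustrative sketch, not the article's own listing: the function name find_duplicate_words and the sample sentence are assumptions chosen here, and the function returns a list where a full program would print the words.

```python
def find_duplicate_words(text):
    """Return the words that occur more than once, in order of first appearance."""
    # Lowercase first so that "Big" and "big" compare equal, then split.
    words = text.lower().split()
    duplicates = []

    for i in range(len(words)):
        # Skip positions already marked as a counted duplicate.
        # Caveat: a genuine word "0" in the input would be skipped too.
        if words[i] == "0":
            continue
        count = 1
        for j in range(i + 1, len(words)):
            if words[i] == words[j]:
                count += 1
                words[j] = "0"  # mark so it is not counted again
        if count > 1:
            duplicates.append(words[i])

    return duplicates


# Hypothetical sample sentence, used only for illustration.
sample = "Big black bug bit a big black dog on his big black nose"
print(find_duplicate_words(sample))  # ['big', 'black']
```

The nested loops make this quadratic in the number of words. A dictionary-based count, as with collections.Counter, would do the same job in a single pass.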
Solution
Python
Output:
Duplicate words in a given string : big black
C
Output:
Duplicate words in a given string : big black
JAVA
Output:
Duplicate words in a given string : big black
C#
Output:
Duplicate words in a given string : big Black
PHP
Output:
Duplicate words in a given string : big black