Python regex remove duplicate words

I am very new to Python.

I want to change a sentence if it contains repeated words.

Correct

Ex. "this just so so so nice" --> "this just so nice"
Ex. "this is just is is" --> "this is just is"

Right now I am using this regex, but it also changes repeated letters. Ex. "My friend and i is happy" --> "My friend and is happy" (it removes the "i" and the space). ERROR

text = re.sub(r'(\w+)\1', r'\1', text)  # remove duplicated words in a row

How can I make the same change but have it check whole words instead of letters?

asked Jun 21, 2013 at 15:08

text = re.sub(r'\b(\w+)( \1\b)+', r'\1', text)  # remove duplicated words in a row

The \b matches the empty string, but only at the beginning or end of a word.
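As a quick check, here is a minimal sketch that wraps the pattern from this answer in a function and runs it on the sentences from the question. The function name `collapse_repeated_words` is only an illustrative choice, not part of the original answer.

```python
import re


def collapse_repeated_words(text):
    # \b(\w+) captures one whole word; ( \1\b)+ matches one or more
    # repetitions of that same word, each preceded by a single space.
    return re.sub(r'\b(\w+)( \1\b)+', r'\1', text)


print(collapse_repeated_words("this just so so so nice"))   # this just so nice
print(collapse_repeated_words("this is just is is"))        # this is just is
print(collapse_repeated_words("My friend and i is happy"))  # My friend and i is happy
```

Because of the trailing \b, the lone "i" is not treated as a repeat of the "i" that starts "is", so the last sentence is left unchanged.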

answered Jun 21, 2013 at 15:15

tomtom


Non-regex solution using itertools.groupby:

>>> strs = "this is just is is"
>>> from itertools import groupby
>>> " ".join([k for k, v in groupby(strs.split())])
'this is just is'
>>> strs = "this just so so so nice"
>>> " ".join([k for k, v in groupby(strs.split())])
'this just so nice'
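The same idea can be packaged as a reusable function. The sketch below is one possible variant, not part of the original answer: the name `remove_adjacent_duplicates` and the `ignore_case` parameter are assumptions added for illustration. When case is ignored, the first spelling of each run is kept.

```python
from itertools import groupby


def remove_adjacent_duplicates(sentence, ignore_case=False):
    # groupby puts consecutive equal items into one group; with
    # key=str.lower, "Hello" and "hello" land in the same group.
    key = str.lower if ignore_case else None
    words = sentence.split()
    # next(group) yields the first word of each run, in its original spelling.
    return " ".join(next(group) for _, group in groupby(words, key=key))


print(remove_adjacent_duplicates("this just so so so nice"))
print(remove_adjacent_duplicates("Hello hello world world", ignore_case=True))
```

Note that str.split() also collapses runs of whitespace, and that punctuation stays attached to its word, so "so, so" would not count as a repeat here.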

answered Jun 21, 2013 at 15:10

Ashwini Chaudhary


\b: Matches Word Boundaries
\w: Any word character

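\1: Backreference to the first captured group. In the pattern it matches a repeat of the captured word; in the replacement it keeps a single copy of that word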

  import re


  def Remove_Duplicates(Test_string):
      # Match a word followed by one or more repeats of itself,
      # each separated by a single non-word character.
      Pattern = r"\b(\w+)(?:\W\1\b)+"
      return re.sub(Pattern, r"\1", Test_string, flags=re.IGNORECASE)


  Test_string1 = "Good bye bye world world"
  Test_string2 = "Ram went went to to his home"
  Test_string3 = "Hello hello world world"
  print(Remove_Duplicates(Test_string1))
  print(Remove_Duplicates(Test_string2))
  print(Remove_Duplicates(Test_string3))

Result:

    Good bye world
    Ram went to his home
    Hello world
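One detail worth noting is that the pattern in this answer uses a single \W between repeats, so it only handles words separated by exactly one non-word character. The sketch below compares it with a \W+ variant; the constant names are illustrative assumptions.

```python
import re

# Pattern from the answer: exactly one non-word character between repeats.
SINGLE_SEPARATOR = re.compile(r"\b(\w+)(?:\W\1\b)+", re.IGNORECASE)
# Variant: one or more non-word characters between repeats.
ANY_SEPARATOR = re.compile(r"\b(\w+)(?:\W+\1\b)+", re.IGNORECASE)

sentence = "Good bye, bye world"
print(SINGLE_SEPARATOR.sub(r"\1", sentence))  # Good bye, bye world
print(ANY_SEPARATOR.sub(r"\1", sentence))     # Good bye world
```

With \W+ the comma is removed along with the repeated word, which may or may not be what you want.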

answered Feb 17, 2021 at 19:22


Given a string str which represents a sentence, the task is to remove the duplicate words from the sentence using a regular expression in Java.
Examples:

Input: str = “Good bye bye world world”
Output: Good bye world
Explanation: We remove the second occurrence of bye and world from “Good bye bye world world”.

Input: str = “Ram went went to to to his home”
Output: Ram went to his home
Explanation: We remove the second occurrence of went and the second and third occurrences of to from “Ram went went to to to his home”.

Input: str = “Hello hello world world”
Output: Hello world
Explanation: We remove the second occurrence of hello and world from “Hello hello world world”.

Approach

Get the sentence.
Form a regular expression to remove duplicate words from sentences.

regex = "\\b(\\w+)(?:\\W+\\1\\b)+";

The details of the above regular expression can be understood as:
- “\\b”: A word boundary. Boundaries are needed for special cases. For example, in “My thesis is great”, the “is” at the end of “thesis” won't be matched as a repeat of the following “is”.
- “\\w+”: One or more word characters: [a-zA-Z_0-9]
- “\\W+”: One or more non-word characters: [^\w]
- “\\1”: Matches whatever was matched in the 1st group of parentheses, which in this case is the (\w+)
- “+”: Match whatever it's placed after 1 or more times
Match the sentence with the regex. In Java, this can be done using Pattern.matcher().
Return the modified sentence.
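To see why the word boundaries matter, here is a small Python sketch (Python rather than Java, to match the question) that runs the same pattern with and without \b on the “My thesis is great” example. The constant names are illustrative assumptions.

```python
import re

# Same pattern as in the approach, written as a Python raw string.
WITH_BOUNDARIES = re.compile(r"\b(\w+)(?:\W+\1\b)+", re.IGNORECASE)
# Boundaries removed, to show what goes wrong without them.
WITHOUT_BOUNDARIES = re.compile(r"(\w+)(?:\W+\1)+", re.IGNORECASE)

sentence = "My thesis is great"
print(WITH_BOUNDARIES.sub(r"\1", sentence))     # My thesis is great
print(WITHOUT_BOUNDARIES.sub(r"\1", sentence))  # My thesis great
```

Without \b, the “is” at the end of “thesis” is captured and the standalone “is” is treated as its repeat, so a real word is lost.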

Below is the implementation of the above approach:

C++

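#include <iostream>
#include <regex>
#include <string>

using namespace std;

// Collapse each run of a repeated word (case-insensitive) into its first occurrence.
string removeDuplicateWords(string s)
{
    const regex pattern("\\b(\\w+)(?:\\W+\\1\\b)+", regex_constants::icase);
    string answer = s;
    for (auto it = sregex_iterator(s.begin(), s.end(), pattern);
         it != sregex_iterator(); it++)
    {
        smatch match = *it;
        // Replace the whole repeated run with the first captured word.
        answer.replace(answer.find(match.str(0)), match.str(0).length(), match.str(1));
    }
    return answer;
}

int main()
{
    string str1 = "Good bye bye world world";
    cout << removeDuplicateWords(str1) << endl;
    return 0;
}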