Friday Programming Quiz [modified]
-
In a language of your choice (no PE), implement the following:
string RemoveDuplicates(string csvString) {
}
The function should remove all duplicate values from a string containing comma-separated values.
RemoveDuplicates("a,b,c,a") => "a,b,c"
RemoveDuplicates("a,b,c,a,c,b,c") => "a,b,c"
RemoveDuplicates("a,b,c") => "a,b,c"
RemoveDuplicates("cat,dog,dog") => "cat,dog"
The ideal implementation should have just 1 line of code.
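For reference, a minimal Python sketch of the requested behaviour. It assumes that the first occurrence of each value is kept and that the original order is preserved; the examples suggest this, but the quiz does not say so explicitly. The name remove_duplicates is just a Python-style spelling of the quiz's function.

```python
def remove_duplicates(csv_string: str) -> str:
    # dict.fromkeys keeps the first occurrence of each key, in insertion
    # order (guaranteed since Python 3.7).
    return ",".join(dict.fromkeys(csv_string.split(",")))


if __name__ == "__main__":
    for sample in ("a,b,c,a", "a,b,c,a,c,b,c", "a,b,c", "cat,dog,dog"):
        print(sample, "=>", remove_duplicates(sample))
```

The body is a single statement, which is about as close to "1 line" as Python gets here.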
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -Brian Kernighan
This uses a string parser class I wrote (which is available in both MFC and STL versions here on CodeProject). I used the methods from memory, so they may not be precise, but this should do what you want. The added benefit is that the CQStringParser class supports quoted sub-strings. :)
CString RemoveDuplicates(CString strSource)
{
    CQStringParser parser(strSource, ',');
    int nCount = parser.GetCount();
    CStringArray strUniques;
    for (int i = 1; i <= nCount; i++)
    {
        CString strStart = parser.GetField(i);
        bool bFound = false; // reset for every field
        int nUniqueSize = strUniques.GetSize();
        for (int j = 0; j < nUniqueSize; j++)
        {
            // compare against the j-th unique value found so far
            if (strStart.CompareNoCase(strUniques.GetAt(j)) == 0)
            {
                bFound = true;
                break;
            }
        }
        if (!bFound)
        {
            strUniques.Add(strStart);
        }
    }
    parser.RemoveAll();
    int nUniqueSize = strUniques.GetSize();
    for (int j = 0; j < nUniqueSize; j++)
    {
        parser.AddField(strUniques.GetAt(j));
    }
    CString strResult = parser.RebuildOriginalString();
    return strResult;
}
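The quoted sub-string point matters for real CSV data: a plain split on ',' would cut a value such as "b,c" in two. Here is a rough Python sketch of the same idea, using the standard csv module rather than CQStringParser (whose exact API is not shown in this thread). The function name and the ignore_case option are illustrative assumptions; the option mirrors the CompareNoCase call above.

```python
import csv
import io


def remove_duplicates_quoted(csv_string: str, ignore_case: bool = True) -> str:
    # csv.reader understands quoted fields, so '"b,c"' stays one value.
    fields = next(csv.reader([csv_string]), [])
    seen = set()
    unique = []
    for field in fields:
        key = field.casefold() if ignore_case else field
        if key not in seen:
            seen.add(key)
            unique.append(field)
    # csv.writer re-quotes any field that contains a comma.
    out = io.StringIO()
    csv.writer(out, lineterminator="").writerow(unique)
    return out.getvalue()


if __name__ == "__main__":
    print(remove_duplicates_quoted('a,"b,c",A,a'))  # a,"b,c"
```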
"Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997
-----
"...the staggering layers of obscenity in your statement make it a work of art on so many levels." - Jason Jystad, 10/26/2001 -
:-D Please tell me you just made that up. That isn't an actual example of PE programming; it can't be, that would just be absurd.
Using the GridView is like trying to explain to someone else how to move a third person's hands in order to tie your shoelaces for you. -Chris Maunder
-
In Haskell,
import Data.List
removeDuplicates csvStr = nub (map (delete ',') (groupBy (\a b -> b == ',') csvStr))
I had to write the 'split on ,' functionality, which takes most of the declaration (it's this bit
map (delete ',') (groupBy (\a b -> b == ',') csvStr)
), but Haskell handily has a 'remove duplicates from a list' function, nub.
[Edit] Whoops - forgot to reconstruct the string (also, didn't cope with multi-char strings)!
import Data.List
removeDuplicates csvStr = concat $ intersperse "," $ nub $ map (delete ',') (groupBy (\a b -> b /= ',') csvStr)
[/Edit]
[Edit 2] And on further investigation of Haskell's libraries, there's a splitRegex function:
import Data.List -- for intersperse, nub
import Text.Regex -- for splitRegex, mkRegex
removeDuplicates csvStr = concat $ intersperse "," $ nub $ splitRegex (mkRegex ",") csvStr
[/Edit 2]
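The final Haskell version is a three-stage pipeline: split on commas, drop repeats (nub), and re-join with commas (intersperse plus concat). A sketch of the same shape in Python, for comparison only. The helper nub below is a hand-written stand-in that, like Data.List.nub, relies only on equality and is therefore quadratic; remove_duplicates_pipeline is a made-up name.

```python
def nub(items):
    # Keep the first occurrence of each item, comparing with == only
    # (no hashing), which mirrors Data.List.nub and costs O(n^2).
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result


def remove_duplicates_pipeline(csv_str):
    # split -> nub -> join, mirroring splitRegex / nub / intersperse + concat
    return ",".join(nub(csv_str.split(",")))


if __name__ == "__main__":
    print(remove_duplicates_pipeline("cat,dog,dog"))  # cat,dog
```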
-
Plain English Function Called "Remove Duplicates" with Argument Consisting of Comma Separated Values in a Character String Remove Duplicate Values From The Plain English Function Argument Consisting of Comma Separated Values in a Character String Return The Plain English Function Argument Consisting of Comma Separated Values in a Character String, But With Duplicate Values Removed End Of Plain English Function Called "Remove Duplicates" with Argument Consisting of Comma Separated Values in a Character String
Excuse me while I go hurl X|
Jon Sagara
When I grow up, I'm changing my name to Joe Kickass!
My Site | My Blog | My Articles
This just won't ever get old... :laugh::laugh::laugh:
What's in a sig? This statement is false. Build a bridge and get over it. ~ Chris Maunder
-
In C#
string RemoveDuplicates(string csvString)
{
    string[] x = csvString.Split(char.Parse(","));
    System.Collections.Specialized.StringCollection c = new System.Collections.Specialized.StringCollection();
    foreach (string y in x)
    {
        if (!c.Contains(y)) c.Add(y);
    }
    string result = "";
    foreach (string z in c)
    {
        result += z + ",";
    }
    return result.Substring(0, result.Length - 1);
}
how vital enterprise application are for proactive organizations leveraging collective synergy to think outside the box and formulate their key objectives into a win-win game plan with a quality-driven approach that focuses on empowering key players to drive-up their core competencies and increase expectations with an all-around initiative to drive up the bottom-line. But of course, that's all a "high level" overview of things --thedailywtf 3/21/06
public static string RemoveDuplicates ( string Subject )
{
    System.Text.StringBuilder result = new System.Text.StringBuilder ( Subject.Length ) ;
    System.Collections.Specialized.StringCollection dic = new System.Collections.Specialized.StringCollection() ;
    foreach ( string temp in Subject.Split ( new char[] { ',' } , System.StringSplitOptions.None ) )
    {
        if ( !dic.Contains ( temp ) )
        {
            dic.Add ( temp ) ;
            result.Append ( temp ) ;
            result.Append ( "," ) ;
        }
    }
    return ( result.Remove ( result.Length-1 , 1 ).ToString() ) ;
}
-
Does it count if we write a class to implement a distinct StringCollection with an appropriate ToString() to do most of the work? Resultant function could be something like:
string RemoveDuplicates(string csvString)
{
    return ( (new DistinctStringCollection ( csvString.Split ( new char[] { ',' } ) )).ToString() ) ;
}
-
Depends on the language (it is probably better to say 1 statement rather than 1 line): Something like this[^]
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -Brian Kernighan
I just wanted to point out that "lines of code" is not a very worthwhile concept in relation to "modern programming languages".
-
function Reduce(str)
{
    var ret = [];
    var seen = {}; // values already added, kept apart from the result array
    var a = str.split(',');
    for (var i = 0; i < a.length; i++)
    {
        if ( !seen.hasOwnProperty(a[i]) ) ret.push(a[i]);
        seen[a[i]] = true;
    }
    return ret.join(',');
}
Or, if you can use JavaScript 1.7:
function Reduce(str)
{
    function Unique(a)
    {
        var o = {};
        for each (var i in a)
        {
            if (!o[i]) yield i;
            o[i] = true;
        }
    }
    return [i for (i in Unique(str.split(',')))].join(',');
}
-
This is what LINQ is for:
return string.Join(",", csvString.Split(',').Distinct());
Edit: Note that it is also the most efficient solution - it's O(N) because Distinct() internally uses a hash table. The C++ set<> solutions are O(N log N), though probably faster in the real world. And everything running Contains() repeatedly will be O(N²). Not that anyone would store large amounts of data in CSV strings....
Second modification: Sadly, it won't work like that. Distinct() returns IEnumerable, but for some strange reason, Join only works with arrays. So if we don't get a new Join() overload in .NET 3.5, add a .ToArray() extension method call behind the Distinct().
Last modified: 24mins after originally posted --
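To make the complexity argument concrete, here is a small Python sketch (not the .NET code) contrasting the two strategies: a hash-based "seen" set, which does roughly one lookup per value, and a repeated linear Contains()-style scan. The comparison counter is an illustrative addition to make the difference visible, and both function names are made up.

```python
def dedupe_hashed(values):
    # One average-O(1) set lookup per value: O(N) overall.
    seen = set()
    out = []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out


def dedupe_scanning(values):
    # Linear scan of the result list for every value, like calling
    # Contains() on a list each time: O(N^2) in the worst case.
    out = []
    comparisons = 0
    for v in values:
        found = False
        for u in out:
            comparisons += 1
            if u == v:
                found = True
                break
        if not found:
            out.append(v)
    return out, comparisons


if __name__ == "__main__":
    data = [str(i) for i in range(1000)]  # all distinct: the worst case
    result, comparisons = dedupe_scanning(data)
    assert result == dedupe_hashed(data)
    print(comparisons)  # 0 + 1 + ... + 999 = 499500
```

With 1000 distinct values the scanning version already needs about half a million comparisons, while the hashed version does 1000 lookups.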
Daniel Grunwald wrote:
Not that anyone would store large amounts of data in CSV strings....
:laugh: How optimistic. :rolleyes:
Once you wanted revolution
Now you're the institution
How's it feel to be the man?
-
Similarly in Python:
def RemoveDuplicates(text):
    # dict keys keep first-seen order (Python 3.7+); join them back into a CSV string
    return ','.join(dict.fromkeys(elem.strip() for elem in text.split(',')))
(But I also stripped the spaces after splitting on commas, to allow "a, b, b,c,c,d" kind of stuff).
Matt Gerrans
-
#include <iterator>
#include <set>
#include <string>

// Split str on any character in delim and insert the pieces into _container.
template < typename _Cont > void split(const std::string& str, _Cont& _container, const std::string& delim=",")
{
    std::string::size_type lpos = 0;
    std::string::size_type pos = str.find_first_of(delim, lpos);
    while(lpos != std::string::npos)
    {
        _container.insert(_container.end(), str.substr(lpos,pos - lpos));
        lpos = ( pos == std::string::npos ) ? std::string::npos : pos + 1;
        pos = str.find_first_of(delim, lpos);
    }
}
std::string fn(std::string in)
{
    std::string out;
    std::set<std::string> foo; // a set keeps one copy of each value, in sorted order
    split(in, foo);
    for (std::set<std::string>::iterator it=foo.begin();it!=foo.end();it++)
    {
        if ((*it).size() > 0)
        {
            out+=(*it);
            if (std::distance(it, foo.end()) > 1) out+=",";
        }
    }
    return out;
}
and you can count this as my code from CP entry for the day. why is IE (or CP?) putting that sentence inside the PRE ? it's not. it looked fine in FF2.0. -- modified at 19:14 Friday 3rd November, 2006
Boostified:
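#include <set>
#include <string>
#include <boost/tokenizer.hpp>

std::string fn(std::string in)
{
    std::string out;
    typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
    tokenizer foo(in, boost::char_separator<char>(","));
    std::set<std::string> seen; // tokens already written to out
    for (tokenizer::iterator it = foo.begin(), end = foo.end(); it != end; ++it)
    {
        if (seen.insert(*it).second) // true only for the first occurrence
        {
            if (!out.empty()) out += ",";
            out += *it;
        }
    }
    return out;
}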
There's probably a boost function somewhere which allows one to join strings as well, but I didn't bother to look. :)
-- Not based on the Novel by James Fenimore Cooper
-
print RemoveDuplicates("a,b,b,c,b,c");
sub RemoveDuplicates
{
foreach (split (/,/,$_[0])) { $_{$_} = $_; }
join (",",keys %_);
}
-
someday i'll try to figure out how to use boost again. the last time i tried, it was a total installation, dependency, compiler configuration nightmare. definitely not the kind of thing i wanted to get into, just to use their regexp classes.
-
I've had no problems with the releases since 1.30. This has been with VS 2k3 - don't know what'll happen with VC6 or 2k5.
-- -= Proudly Made on Earth =-
-
Yeah piece of cake:
return (new UniqueStringList(csvString.Split(new char[] {','}))).ToString("", ",", "");
(UniqueStringList: http://www.codeproject.com/csharp/uniquestringlist.asp)
It is kind of cheating though ;P
"..Commit yourself to quality from day one..it's better to do nothing at all than to do something badly.." -- Mark McCormick
|| Fold With Us! || Pensieve || VG.Net ||
-
Ah.. Perl syntax. Gives me the shivers every time. ;)
-- Now with chucklelin
-
Ah, very good, I'll have to take a deeper look at UniqueStringList. Here's what I just whipped up:
public partial class DistinctStringCollection : System.Collections.Specialized.StringCollection
{
    public static DistinctStringCollection FromCSV ( string CSV )
    {
        DistinctStringCollection result = new DistinctStringCollection() ;
        foreach ( string temp in CSV.Trim ( new char[] { ',' } ).Split ( new char[] { ',' } ) )
        {
            if ( !result.Contains ( temp ) )
            {
                result.Add ( temp ) ;
            }
        }
        return ( result ) ;
    }
    public string ToCSV ( )
    {
        System.Text.StringBuilder result = new System.Text.StringBuilder() ;
        foreach ( string temp in this )
        {
            result.Append ( temp ) ;
            result.Append ( "," ) ;
        }
        return ( result.Remove ( result.Length-1 , 1 ).ToString() ) ;
    }
}
And then...
public static string RemoveDuplicates ( string Subject )
{
    return ( DistinctStringCollection.FromCSV ( Subject ).ToCSV() ) ;
}
-
Perl (not tested, may or may not work) ;)
print join ( ',', grep { ++$tokens{$_} == 1 } split(/,/) )
--Mike-- Visual C++ MVP :cool: LINKS~! Ericahist | PimpFish | CP SearchBar v3.0 | C++ Forum FAQ
-
Does keys return the keys in the same order as they were inserted? When I did my solution below, I consciously kept the order of the words the same in the output.
--Mike-- Visual C++ MVP :cool: LINKS~! Ericahist | PimpFish | CP SearchBar v3.0 | C++ Forum FAQ
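On the ordering question: Perl makes no promise about the order in which keys returns a hash's keys, so a hash-only solution may reorder the values. The same distinction can be sketched in Python, where a set gives no ordering guarantee while dict.fromkeys keeps first-seen order (Python 3.7+). Both function names are illustrative.

```python
def dedupe_unordered(csv_string):
    # A set drops duplicates but makes no promise about order, so the
    # result is sorted here just to make it deterministic.
    return ",".join(sorted(set(csv_string.split(","))))


def dedupe_ordered(csv_string):
    # dict.fromkeys keeps each value at the position of its first occurrence.
    return ",".join(dict.fromkeys(csv_string.split(",")))


if __name__ == "__main__":
    print(dedupe_unordered("dog,cat,dog"))  # cat,dog
    print(dedupe_ordered("dog,cat,dog"))    # dog,cat
```

The quiz's own examples happen to be in alphabetical order already, so they cannot tell the two behaviours apart; an input such as "dog,cat,dog" can.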