Friday Programming Quiz [modified]
-
Depends on the language (it is probably better to say 1 statement rather than 1 line): something like this.
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -Brian Kernighan
I just wanted to point out that "lines of code" is not a very worthwhile concept in relation to "modern programming languages".
-
In a language of your choice (no PE), implement the following:
string RemoveDuplicates(string csvString) {
}
The function should remove all duplicate values from a string containing comma-separated values.
RemoveDuplicates("a,b,c,a") => "a,b,c"
RemoveDuplicates("a,b,c,a,c,b,c") => "a,b,c"
RemoveDuplicates("a,b,c") => "a,b,c"
RemoveDuplicates("cat,dog,dog") => "cat,dog"
The ideal implementation should have just 1 line of code.
function Reduce(str)
{
    var ret = [];
    var seen = {};   // separate lookup table, so values like "0" or "length" cannot clobber the result array
    var a = str.split(',');
    for (var i = 0; i < a.length; i++)
    {
        if ( !Object.prototype.hasOwnProperty.call(seen, a[i]) ) ret.push(a[i]);
        seen[a[i]] = true;
    }
    return ret.join(',');
}
Or, if you can use 1.7:
function Reduce(str)
{
    function Unique(a)
    {
        var o = {};
        for each (var i in a)
        {
            if (!o[i]) yield i;
            o[i] = true;
        }
    }
    return [i for (i in Unique(str.split(',')))].join(',');
}
-
This is what LINQ is for:
return string.Join(",", csvString.Split(',').Distinct());
Edit: Note that it is also the most efficient solution - it's O(N) because Distinct() internally uses a hash table. The C++ set<> solutions are O(N log N), though probably faster in the real world. And everything running Contains() repeatedly will be O(N²). Not that anyone would store large amounts of data in CSV strings....
Second modification: Sadly, it won't work like that. Distinct() returns IEnumerable, but for some strange reason, Join only works with arrays. So if we don't get a new Join() overload in .NET 3.5, add a .ToArray() extension method call behind the Distinct().
Last modified: 24mins after originally posted --
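To make the complexity argument concrete, here is a minimal Python sketch (not from the thread; both function names are made up for illustration). It contrasts a hash-based membership test, which is the behaviour attributed to Distinct() above, with a repeated linear scan, which is roughly what calling Contains() on a list each time amounts to. Both keep the first-seen order of the values.

```python
def remove_duplicates_hashed(csv_string: str) -> str:
    """Roughly O(N): each membership test is an average O(1) hash lookup."""
    seen = set()
    unique = []
    for value in csv_string.split(","):
        if value not in seen:        # hash lookup in a set
            seen.add(value)
            unique.append(value)
    return ",".join(unique)


def remove_duplicates_scanning(csv_string: str) -> str:
    """O(N^2) in the worst case: each membership test scans the result list."""
    unique = []
    for value in csv_string.split(","):
        if value not in unique:      # linear scan of a list
            unique.append(value)
    return ",".join(unique)


if __name__ == "__main__":
    for sample in ("a,b,c,a", "a,b,c,a,c,b,c", "a,b,c", "cat,dog,dog"):
        print(sample, "=>", remove_duplicates_hashed(sample))
```

For the short strings in the quiz the difference will not be measurable; it should only start to matter when the number of distinct values gets large.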
Daniel Grunwald wrote:
Not that anyone would store large amounts of data in CSV strings....
:laugh: How optimistic. :rolleyes:
Once you wanted revolution
Now you're the institution
How's it feel to be the man?
-
Similarly in Python:
def RemoveDuplicates(text): return ','.join({}.fromkeys([elem.strip() for elem in text.split(',')]))
(But I also stripped the spaces after splitting on commas, to allow "a, b, b,c,c,d" kind of stuff).
Matt Gerrans
-
#include <iterator>
#include <set>
#include <string>

template < typename _Cont > void split(const std::string& str, _Cont& _container, const std::string& delim=",")
{
    std::string::size_type lpos = 0;
    std::string::size_type pos = str.find_first_of(delim, lpos);
    while(lpos != std::string::npos)
    {
        _container.insert(_container.end(), str.substr(lpos,pos - lpos));
        lpos = ( pos == std::string::npos ) ? std::string::npos : pos + 1;
        pos = str.find_first_of(delim, lpos);
    }
}
std::string fn(std::string in)
{
    std::string out;
    std::set<std::string> foo;   // the set drops duplicates (and sorts the values)
    split(in, foo);
    for (std::set<std::string>::iterator it=foo.begin();it!=foo.end();it++)
    {
        if ((*it).size() > 0)
        {
            out+=(*it);
            if (std::distance(it, foo.end()) > 1) out+=",";
        }
    }
    return out;
}
and you can count this as my code from CP entry for the day. why is IE (or CP?) putting that sentence inside the PRE ? it's not. it looked fine in FF2.0. -- modified at 19:14 Friday 3rd November, 2006
Boostified:
#include <boost/tokenizer.hpp>
#include <string>

std::string fn(std::string in)
{
    std::string out;
    typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
    tokenizer foo(in, boost::char_separator<char>(","));
    tokenizer::iterator it = foo.begin(), end = foo.end();
    // Note: as posted, this only tokenizes and re-joins; duplicates are not removed.
    while(it != end)
    {
        out += *it++;
        if(it != end) out += ",";
    }
    return out;
}
There's probably a boost function somewhere which allows one to join strings as well, but I didn't bother to look. :)
-- Not based on the Novel by James Fenimore Cooper
-
print RemoveDuplicates("a,b,b,c,b,c");
sub RemoveDuplicates
{
    foreach (split (/,/,$_[0])) { $_{$_} = $_; }
    join (",",keys %_);
}
-
someday i'll try to figure out how to use boost again. the last time i tried, it was a total installation, dependency, compiler configuration nightmare. definitely not the kind of thing i wanted to get into, just to use their regexp classes.
-
I've had no problems with the releases since 1.30. This has been with VS 2k3 - don't know what'll happen with VC6 or 2k5.
-- -= Proudly Made on Earth =-
-
Does it count if we write a class to implement a distinct StringCollection with an appropriate ToString() to do most of the work? Resultant function could be something like:
string RemoveDuplicates(string csvString)
{
    return ( (new DistinctStringCollection ( csvString.Split ( new char[] { ',' } ) )).ToString() ) ;
}
Yeah piece of cake, using [UniqueStringList](http://www.codeproject.com/csharp/uniquestringlist.asp):
return (new UniqueStringList(csvString.Split(new char[] {','}))).ToString("", ",", "");
It is kind of cheating though ;P
"..Commit yourself to quality from day one..it's better to do nothing at all than to do something badly.." -- Mark McCormick
|| Fold With Us! || Pensieve || VG.Net ||
-
Ah.. Perl syntax. Gives me the shiver every time. ;)
-- Now with chucklelin
-
Ah, very good, I'll have to take a deeper look at UniqueStringList. Here's what I just whipped up:
public partial class DistinctStringCollection : System.Collections.Specialized.StringCollection
{
    public static DistinctStringCollection FromCSV ( string CSV )
    {
        DistinctStringCollection result = new DistinctStringCollection() ;
        foreach ( string temp in CSV.Trim ( new char[] { ',' } ).Split ( new char[] { ',' } ) )
        {
            if ( !result.Contains ( temp ) )
            {
                result.Add ( temp ) ;
            }
        }
        return ( result ) ;
    }
    public string ToCSV ( )
    {
        System.Text.StringBuilder result = new System.Text.StringBuilder() ;
        foreach ( string temp in this )
        {
            result.Append ( temp ) ;
            result.Append ( "," ) ;
        }
        return ( result.Remove ( result.Length-1 , 1 ).ToString() ) ;
    }
}
And then...
public static string RemoveDuplicates ( string Subject )
{
    return ( DistinctStringCollection.FromCSV ( Subject ).ToCSV() ) ;
}
-
Perl (not tested, may or may not work) ;)
print join ( ',', grep { ++$tokens{$_} == 1 } split(/,/) )
--Mike-- Visual C++ MVP :cool: LINKS~! Ericahist | PimpFish | CP SearchBar v3.0 | C++ Forum FAQ
-
Does keys return the keys in the same order as they were inserted? When I did my solution below, I consciously kept the order of the words the same in the output.
-
Waiting for the Friday programming post is becoming a bit of an addiction... :) Hmm, but I'm just watching for now. I'll start posting my own version soon :). Very nice, Rama :).. In particular, every time I look for Nish to respond with his code :-D... that's cool. Also, at the end you can post your own answer, right?
:Gong: 歡迎光臨 吐 西批 :Gong:
-
Michael Dunn wrote:
Does keys return the keys in the same order as they were inserted?
From the perldoc: "The keys are returned in an apparently random order. The actual random order is subject to change in future versions of perl. Since Perl 5.8.1 the ordering is different even between different runs of Perl for security reasons." But that was not a requirement ;P
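The same ordering question comes up outside Perl. As a minimal Python sketch (not from the thread; the function names are hypothetical): going through a set gives no ordering guarantee, so sorting is one way to at least get a deterministic result, while dict keys keep first-seen order (guaranteed from Python 3.7 on).

```python
def dedupe_sorted(csv_string: str) -> str:
    # A set discards insertion order; sorting afterwards makes the
    # output deterministic, but it no longer matches the input order.
    return ",".join(sorted(set(csv_string.split(","))))


def dedupe_first_seen(csv_string: str) -> str:
    # dict.fromkeys keeps one key per distinct value, in the order
    # each value first appeared (insertion order, Python 3.7+).
    return ",".join(dict.fromkeys(csv_string.split(",")))


if __name__ == "__main__":
    sample = "cat,dog,ant,dog"
    print(dedupe_sorted(sample))      # alphabetical order
    print(dedupe_first_seen(sample))  # order of first appearance
```

Whether order matters is a judgment call, since the quiz examples happen to pass either way for already-sorted input like "a,b,c,a".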
-
Well, I should have stated that the values are strings not just single characters.
-
Well at least I had something about which to think over the weekend. I've decided that implementing a DistinctStringCollection (or UniqueStringList) isn't very worthwhile (to me). (Nor is writing a RemoveDuplicates that only works on strings.)
Having a method to make a CSV from any IEnumerable is:
public static string CSVify ( System.Collections.IEnumerable Subject )
{
    System.Text.StringBuilder result = new System.Text.StringBuilder() ;
    foreach ( object temp in Subject )
    {
        if ( temp != null )
        {
            result.Append ( temp.ToString() ) ;
            result.Append ( "," ) ;
        }
    }
    return ( result.Remove ( result.Length-1 , 1 ).ToString() ) ;
}
Having methods to remove duplicates from (nearly?) any array or IList is:
public static T[] RemoveDuplicates<T> ( T[] Subject )
{
    T[] result = new T [ Subject.Length ] ;
    foreach ( T temp in Subject )
    {
        for ( int index = 0 ; index < result.Length ; index++ )
        {
            if ( result [ index ] == null )
            {
                result [ index ] = temp ;
                break ;
            }
            else
            {
                if ( temp.Equals ( result [ index ] ) )
                {
                    break ;
                }
            }
        }
    }
    return ( result ) ;
}
public static T RemoveDuplicates<T> ( T Subject ) where T : System.Collections.IList , new()
{
    T result = new T() ;
    foreach ( object temp in Subject )
    {
        if ( !result.Contains ( temp ) )
        {
            result.Add ( temp ) ;
        }
    }
    return ( result ) ;
}
A string-only RemoveDuplicates can then be written as:
public static string RemoveDuplicates ( string Subject )
{
    return ( CSVify ( RemoveDuplicates ( Subject.Split ( new char[] {