Write a function to remove duplicate entries from any given XML. A node is considered a duplicate if the value of its "key" field has already appeared in an earlier node.

Example XML:

<Products>
<Product>
<Name>Milk</Name>
<Amount>4</Amount>
</Product>
<Product>
<Name>Milk</Name>
<Amount>0.5</Amount>
</Product>
<Product>
<Name>Coffee</Name>
<Amount>0.5</Amount>
</Product>
</Products>

Based on the "Name" field, nodes 1 and 2 are considered duplicates, but based on the "Amount" field, nodes 2 and 3 are duplicates. So the task is to write a function:


string DeDup(string xml, string keyNode, string rootPath)


Possible solution:


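using System;
using System.Collections.Generic;
using System.Xml;

private static string RemoveDuplicates(string xml, string key, string rootXPath)
{
    XmlDocument doc = new XmlDocument();
    // Key values seen so far; a HashSet gives constant-time lookups.
    HashSet<string> seen = new HashSet<string>();
    // Duplicates are collected first and removed after the loop,
    // so the node list is not modified while it is being enumerated.
    List<XmlNode> duplicates = new List<XmlNode>();
    try
    {
        // LoadXml parses an XML string; Load would expect a file path or URL.
        doc.LoadXml(xml);
        XmlElement root = doc.DocumentElement;
        XmlNodeList nodes = root.SelectNodes(rootXPath);
        foreach (XmlNode item in nodes)
        {
            XmlNode keyNode = item.SelectSingleNode(key);
            if (keyNode == null)
                continue; // Assumption: a node without the key field is kept.
            // HashSet.Add returns false when the value is already present.
            if (!seen.Add(keyNode.InnerXml))
                duplicates.Add(item);
        }
        foreach (XmlNode item in duplicates)
            item.ParentNode.RemoveChild(item);
        return doc.OuterXml;
    }
    catch (Exception)
    {
        // Log exception...
        throw; // Rethrow without resetting the stack trace.
    }
}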




This solution works well for small XML files, but it is not a good fit for de-duping large ones, because XmlDocument loads the entire document into memory. So the bonus question is to use a SAX-style (streaming) parser in C# to remove duplicates in large XML files...
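
One possible way to approach the bonus question is sketched below; it is an illustration under several assumptions, not a reference answer. .NET has no built-in SAX parser, so the sketch swaps in XmlReader (a forward-only pull parser) together with XmlWriter, and materializes only one record at a time via XNode.ReadFrom. The class name StreamingDeDup, the TextReader/TextWriter signature, and the recordName parameter (an element name used in place of an XPath, since XmlReader cannot evaluate XPath) are all hypothetical. It also assumes the key is a direct child element with no namespace, keeps records that lack the key, and drops insignificant whitespace, the XML declaration, and processing instructions for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class StreamingDeDup
{
    // Hypothetical signature: recordName is the element name of one record
    // (e.g. "Product"), keyName is the child element used as the key.
    public static void DeDup(TextReader input, TextWriter output,
                             string recordName, string keyName)
    {
        var seen = new HashSet<string>();
        var readerSettings = new XmlReaderSettings { IgnoreWhitespace = true };
        var writerSettings = new XmlWriterSettings { OmitXmlDeclaration = true };

        using (XmlReader reader = XmlReader.Create(input, readerSettings))
        using (XmlWriter writer = XmlWriter.Create(output, writerSettings))
        {
            reader.Read();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.LocalName == recordName)
                {
                    // Loads a single record into memory and leaves the reader
                    // on the node that follows it, so no extra Read() here.
                    var record = (XElement)XNode.ReadFrom(reader);
                    XElement keyElement = record.Element(keyName);
                    // Assumption: records without the key are kept.
                    if (keyElement == null || seen.Add(keyElement.Value))
                        record.WriteTo(writer);
                }
                else
                {
                    CopyShallow(reader, writer);
                    reader.Read();
                }
            }
        }
    }

    // Copies the current node only (not its children) to the writer.
    private static void CopyShallow(XmlReader reader, XmlWriter writer)
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                bool isEmpty = reader.IsEmptyElement;
                writer.WriteStartElement(reader.Prefix, reader.LocalName,
                                         reader.NamespaceURI);
                writer.WriteAttributes(reader, false);
                if (isEmpty)
                    writer.WriteEndElement();
                break;
            case XmlNodeType.EndElement:
                writer.WriteFullEndElement();
                break;
            case XmlNodeType.Text:
                writer.WriteString(reader.Value);
                break;
            case XmlNodeType.CDATA:
                writer.WriteCData(reader.Value);
                break;
            case XmlNodeType.Comment:
                writer.WriteComment(reader.Value);
                break;
            // Other node types are intentionally skipped in this sketch.
        }
    }
}
```

For files, the same method could be called with a StreamReader and a StreamWriter, for example the ones returned by File.OpenText and File.CreateText. Memory use then grows with the number of distinct key values rather than with the size of the document, since the set of seen keys is the only thing retained between records.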


