regex - Generic solution for removing the XML declaration using Perl -


Hi, I want to remove the declaration in an XML file. The problem is that the declaration is sometimes on the same line as the root element.

The XML looks as follows:

Case 1:

<?xml version="1.0" encoding="utf-8"?> <document> document root <child>----</child> </document>

Case 2:

<?xml version="1.0" encoding="utf-8"?>
<document> document root <child>----</child> </document>

The function should work for case 1 as well as for the case where the root node is on the next line.

My function only works for case 2.

    sub getxmldata {
        my ($xml) = @_;
        my @data = ();
        open(FILE, "<$xml");
        while (<FILE>) {
            chomp;
            # skips the whole line that holds the declaration
            if (/\<\?xml\sversion/) { next; }
            push(@data, $_);
        }
        close(FILE);
        return join("\n", @data);
    }

*** Please note that the encoding is not always constant.

OK, the problem here is that you're trying to parse XML line by line, and that doesn't work. You should avoid doing it, because it makes for brittle code that will one day break on, as you've noted, perfectly valid changes to the source XML. Both of your documents are semantically identical, so the fact that your code handles one and not the other is an example of why handling XML this way is a bad idea.

More importantly though: why are you trying to remove the XML declaration from your XML? What are you trying to accomplish?

Generically reformatting XML can be done like this:

    #!/usr/bin/perl

    use strict;
    use warnings;

    use XML::Twig;

    my $twig = XML::Twig->new(
        pretty_print => 'indented',
    );
    $twig->parsefile('your_xml_file');
    $twig->print;

This will parse your XML and reformat it in one of the valid ways XML may be formatted. I would urge you not to discard the XML declaration, and instead carry on using XML::Twig to process it. (Open a new question with what you're trying to accomplish, and I'll happily give you a solution that doesn't trip over different valid formats of XML.)
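If you really do need the document without its declaration, a parser can still do that for you. The following is only a minimal sketch, assuming the XML is already in a string; the helper name xml_without_declaration is made up for illustration. It serialises just the root element with sprint, so it does not matter which line the declaration was on or which encoding it names.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: return the document with no XML declaration.
sub xml_without_declaration {
    my ($xml_string) = @_;
    my $twig = XML::Twig->new;
    $twig->parse($xml_string);     # dies if the input is not well-formed
    return $twig->root->sprint;    # serialises the root element only
}

# The two layouts from the question: root on the same line, root on the next line.
my $case1 = qq{<?xml version="1.0" encoding="utf-8"?> <document> document root <child>----</child> </document>};
my $case2 = qq{<?xml version="1.0" encoding="utf-8"?>\n<document> document root <child>----</child> </document>};

print xml_without_declaration($case1), "\n";
print xml_without_declaration($case2), "\n";
```

Both calls should print the same element, because the whitespace between the declaration and the root is not part of the document content.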

When it comes to merging XML documents, XML::Twig can do that too, and it still checks that the XML is well-formed as it goes.

So you might do this (extending the above):

    # @file_list holds the names of the XML files to merge
    foreach my $file ( @file_list ) {
        my $child = XML::Twig->new();
        $child->parsefile($file);
        my $child_doc = $child->root->cut;
        $child_doc->paste( $twig->root );
    }

    $twig->print;

Exactly what you'd need to do depends a little on your desired output structure; you'd need to 'wrap' everything in a root element anyway. Open a new question with some sample input and desired output, and I'll happily take a crack at it.
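To give a rough idea of the 'wrap' step, here is a sketch under assumptions, not a finished solution: the helper name merge_under_wrapper and the wrapper element name merged are invented for illustration, and the inputs are taken as strings rather than files. It uses the same cut and paste calls as the loop above, but pastes each root as the last child so the inputs keep their order.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: parse each XML string and collect its root element
# under a new wrapper element whose name the caller chooses.
sub merge_under_wrapper {
    my ( $wrapper_name, @xml_strings ) = @_;

    my $combined = XML::Twig->new( pretty_print => 'indented' );
    $combined->parse("<$wrapper_name/>");    # empty wrapper document

    for my $xml (@xml_strings) {
        my $child = XML::Twig->new;
        $child->parse($xml);                 # dies on malformed input
        my $child_root = $child->root->cut;  # detach the root from its own twig
        $child_root->paste( last_child => $combined->root );    # keep input order
    }
    return $combined->root->sprint;          # wrapper element, no declaration
}

my $sample = qq{<?xml version="1.0" encoding="utf-8"?>\n<document> document root <child>----</child> </document>};
print merge_under_wrapper( 'merged', $sample, $sample );
```

Whether you want a wrapper like this, or the children merged into one existing root, depends on the output structure you're after.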

As an example, if you feed the above your sample input twice, you get:

<?xml version="1.0" encoding="utf-8"?> <document><document> document root <child>----</child></document> document root <child>----</child></document> 

Which I know isn't what you want, but it illustrates the parser-based way of restructuring XML.

