Question about carriage return / line feed handling with SCAP text_file_content_54 test


Mark Sheahan
Hello,

We’re developing an OVAL interpreter and have hit an issue while running the NIST SCAP validation suite data (http://scap.nist.gov/validation/resources.html). Relevant files are attached for convenience. Our OVAL interpreter fails test oval:nist.validation.textFileContent54:def:73 because the text_file_content_54_item we produce contains ‘abcdefghijklmnopqrstuvwxyz\r’ instead of ‘abcdefghijklmnopqrstuvwxyz’.

My question is: when matching a dot in a regular expression on Windows, should \r\n be treated as the end of a line, or only ‘\n’? Is it acceptable to convert \r\n to \n in the input data before regex matching? Or is there a problem with the test data?
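
To make the symptom concrete, here is a minimal sketch of the behaviour described above. It uses Python's re module as a stand-in for the regex engine of an OVAL interpreter, and the pattern is a hypothetical one chosen for illustration, not the pattern from the NIST test content. The assumption is only that the dot matches every character except ‘\n’, so a ‘\r’ preceding the newline ends up in the captured text.

```python
import re

# The file content as it ends up on disk on Windows (CRLF) and as the
# test author apparently intended it (LF only).
CRLF_DATA = "abcdefghijklmnopqrstuvwxyz\r\n1234567890a"
LF_DATA = "abcdefghijklmnopqrstuvwxyz\n1234567890a"

# Hypothetical pattern for illustration: capture one whole line.
# With re.MULTILINE, '^' and '$' anchor at line boundaries defined by '\n'.
LINE_PATTERN = re.compile(r"^(.*)$", re.MULTILINE)


def first_line(data):
    """Return the text captured for the first line, or None if no match."""
    match = LINE_PATTERN.search(data)
    return match.group(1) if match else None


if __name__ == "__main__":
    # repr() makes the trailing carriage return visible.
    print(repr(first_line(CRLF_DATA)))
    print(repr(first_line(LF_DATA)))
```

With no platform-specific newline handling, the capture for the CRLF data keeps the trailing ‘\r’, which is the same mismatch our interpreter reports.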


The problem might be that on Windows, Python's text-mode file writing inserts a carriage return before each newline. text_file54_test_config_1.py contains the following fragment:

    createFile(r"C:/scap_validation_content/ind_tfc_53/a/1.txt","abcdefghijklmnopqrstuvwxyz\n1234567890a")
    …
    def createFile(filename, content):
        # Note: the file is opened with 'w' (text mode), not 'wb' (write-binary mode)
        with open(filename, 'w') as file:
            file.write(content)
        return

After running this script, the test file has a carriage-return inserted: abcdefghijklmnopqrstuvwxyz\r\n1234567890a, and it appears that the test-writer intended just a ‘\n’, not \r\n. Should the test script open the file in binary mode instead of text mode?
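
As a sketch of the difference, assuming Python 3: the helper names below are hypothetical and are not part of the NIST script. Passing newline='\r\n' is used only to reproduce, on any platform, the translation that text mode applies by default on Windows; writing in binary mode leaves the ‘\n’ untouched.

```python
import os
import tempfile


def create_file_text_crlf(filename, content):
    # newline='\r\n' forces every '\n' written to become '\r\n', which is
    # what default text mode does on Windows.
    with open(filename, 'w', newline='\r\n') as f:
        f.write(content)


def create_file_binary(filename, content):
    # Binary mode writes the bytes exactly as given, with no translation.
    # The test content is plain ASCII, so encoding it is safe here.
    with open(filename, 'wb') as f:
        f.write(content.encode('ascii'))


def read_bytes(filename):
    with open(filename, 'rb') as f:
        return f.read()


def demo():
    """Write the same content both ways and return the raw bytes of each file."""
    content = "abcdefghijklmnopqrstuvwxyz\n1234567890a"
    with tempfile.TemporaryDirectory() as tmp:
        text_path = os.path.join(tmp, "text_mode.txt")
        binary_path = os.path.join(tmp, "binary_mode.txt")
        create_file_text_crlf(text_path, content)
        create_file_binary(binary_path, content)
        return read_bytes(text_path), read_bytes(binary_path)


if __name__ == "__main__":
    text_bytes, binary_bytes = demo()
    print(text_bytes)
    print(binary_bytes)
```

In Python 3, open(filename, 'w', newline='\n') would also suppress the translation while staying in text mode; whether that or 'wb' is appropriate depends on which Python version the validation scripts target, which I have not confirmed.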

Thanks,
Mark


The information contained in this email may be confidential and/or legally privileged. It has been sent for the sole use of the intended recipient(s). If the reader of this message is not an intended recipient, you are hereby notified that any unauthorized review, use, disclosure, dissemination, distribution, or copying of this communication, or any of its contents, is strictly prohibited. If you have received this communication in error, please reply to the sender and destroy all copies of the message. To contact us directly, send to [hidden email] Thank you.  To unsubscribe, send an email message to [hidden email] with SIGNOFF OVAL-DEVELOPER-LIST in the BODY of the message. If you have difficulties, write to [hidden email].
Attachments:
ind_text_file_content_54_test-datastream.xml (272K)
text_file54_test_config_1.py (1K)

Re: Question about carriage return / line feed handling with SCAP text_file_content_54 test

David Solin-3
Hi Mark,

We ran into the same issue in the validation process; Python inserts the carriage return character on Windows, which screws up the test.

There isn’t supposed to be any magical platform-dependent interpretation of newlines when performing a regular expression search for this type of test.

Best regards,
—David Solin

David A. Solin
Co-Founder, Research & Technology
[hidden email]

Joval Continuous Monitoring




(In reply to Mark Sheahan's message of Aug 14, 2015, 12:02 PM, above.)

Re: Question about carriage return / line feed handling with SCAP text_file_content_54 test

Mark Sheahan
Thanks David, much appreciated. I hoped this would be the case; munging the input before analyzing it seems crazy to me.

Best Regards,
Mark


(In reply to David Solin's message of Aug 14, 2015, 10:50 AM, above.)