Update on CWE-20 (improper input validation) and other difficult CWEs


Christey, Steven M.

All,

We have been more closely investigating CWE-20 and its apparent inconsistencies with the validation-oriented category CWE-1215.  We will make appropriate modifications in CWE 4.1, which we hope to release in the early summer.  At the very least, CWE-20 needs to be modified so that CWE-1215 is not seen as a viable alternative by consumers. Generally, category CWEs should not be used for mapping, except perhaps in rollup summaries for code analysis tools. The need to use a category instead of a weakness indicates some kind of problem that needs to be addressed.

Since we are also working on this year's CWE Top 25, we will be taking special note of the reasons why CWE-20 is used so heavily in NVD.  We will investigate how often "improper input validation" is the only weakness information given by the original vendor; when it's effectively used as a placeholder for rare or product-specific weaknesses that are difficult to analyze, suggesting opportunities for training and guidance; when it indicates gaps or usability problems in CWE itself; and other possible reasons.

We believe that some commonly-used, class-level CWE entries may be used too often because the entries themselves are not well organized.  Some other class-level entries on the Top 25 may have similar problems, such as CWE-284 improper access control. People have spent decades studying and classifying buffer overflows and injection problems, but some parts of CWE could benefit from more targeted research, including CWE-20.

If you have your own critiques of CWE-20 itself, or how it's used by others, please let us know, either on this list or privately. We also welcome any suggestions on how it could be improved.

We will keep you informed of our progress over the coming weeks.


Thank you,
Steve Christey Coley

CWE Technical Lead

 

[EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

Kurt Seifried
I would like to see CWE-20 split out into subtypes at a minimum. What kind of improper validation? Assumption about the data? Flaw in the validation itself? Data types? Data format? Data content? Data length (truncation and too long)?

On Tue, May 19, 2020 at 8:31 AM Steven M Christey <[hidden email]> wrote:

--
Kurt Seifried
[hidden email]
RE: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

Christey, Steven M.
In reply to this post by Christey, Steven M.

Kurt,

 

You’re suggesting something similar to what I’ve been thinking about for splitting out CWE-20, which already has various children.  One could imagine types of validation related to size or quantity, position or index, syntactic correctness, semantic correctness, consistency/relationships between multiple data elements, appropriate type, etc.  Our analysis of CWE-20 issues for the Top 25 will effectively provide us with some real-world examples that we can look at to see if these kinds of splits make sense, as will learning about what code analysis tools examine and currently label as CWE-20.
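
To make the kinds of validation listed above concrete, here is a minimal Python sketch with one check per kind. The order form, its field names, the limits, and the `validate_order` function are all hypothetical and chosen only for illustration; they do not reflect actual or planned CWE entries.

```python
from __future__ import annotations

import re
from datetime import date

# Hypothetical catalogue and SKU format, used only for this sketch.
CATALOG = ["widget", "gadget", "gizmo"]
SKU_PATTERN = re.compile(r"[A-Z]{3}-\d{4}")


def validate_order(form: dict) -> list[str]:
    """Return the kinds of validation that the (hypothetical) form fails."""
    failures = []

    # Appropriate type: both fields must be integers
    # (bool is excluded because it is a subclass of int).
    qty, idx = form.get("quantity"), form.get("item_index")
    for name, value in (("quantity", qty), ("item_index", idx)):
        if not isinstance(value, int) or isinstance(value, bool):
            failures.append(f"type:{name}")
    if failures:
        return failures  # the checks below assume correct types

    # Size or quantity
    if not 1 <= qty <= 100:
        failures.append("size-or-quantity")

    # Position or index
    if not 0 <= idx < len(CATALOG):
        failures.append("position-or-index")

    # Syntactic correctness
    if not SKU_PATTERN.fullmatch(str(form.get("sku", ""))):
        failures.append("syntax")

    # Semantic correctness: the strings must denote real calendar dates
    try:
        start = date.fromisoformat(form.get("start", ""))
        end = date.fromisoformat(form.get("end", ""))
    except (TypeError, ValueError):
        failures.append("semantics")
        return failures

    # Consistency between multiple data elements
    if end < start:
        failures.append("consistency")

    return failures
```

Each label in the returned list corresponds to one of the candidate splits, which is roughly the distinction a mapper would need to be able to draw from a vulnerability report.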

 

- Steve

 

 

From: Kurt Seifried <[hidden email]>
Sent: Tuesday, May 19, 2020 11:03 AM
To: Steven M Christey <[hidden email]>
Cc: CWE Research Discussion <[hidden email]>
Subject: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

 

Re: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

Steve Overland
This will be an interesting discussion I'm sure.

I am about to submit an input validation training session proposal for the upcoming OWASP app sec virtual conference.  A deep dive into character encoding.

Offhand comment...  I think a big reason CWE-20 would be referenced so often is...  well...  what percentage of attacks are NOT input-based?

- Steve



On Tue, May 19, 2020 at 12:20 PM Steven M Christey <[hidden email]> wrote:

Re: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

Kurt Seifried
I mean, in theory everything could be caught by some sort of input validation (e.g., if you know the max size a buffer should be, you could add a check to ensure the input isn't over that and avoid a buffer overflow). That's not helpful, though; it's like saying "the secret to being rich is to make more money than you spend". I think the only real option is to add subtypes and then see whether they need to be, or can be, split out into new things.
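
The buffer-size check mentioned above can be sketched as follows. Python itself is memory-safe, so this sketch uses `ctypes` to get a real fixed-size native buffer; `BUF_SIZE` and `copy_checked` are hypothetical names chosen for illustration.

```python
import ctypes

BUF_SIZE = 8  # hypothetical fixed capacity of the destination buffer


def copy_checked(data: bytes) -> bytes:
    """Copy data into a fixed-size native buffer, refusing oversized input."""
    if len(data) > BUF_SIZE:
        # Without this check, memmove would write past the end of buf.
        raise ValueError(f"input is {len(data)} bytes; limit is {BUF_SIZE}")
    buf = ctypes.create_string_buffer(BUF_SIZE)  # zero-filled, BUF_SIZE bytes
    ctypes.memmove(buf, data, len(data))
    return buf.raw
```

The single length comparison is what separates a safe copy from an out-of-bounds write, and a report could describe the same code either as "missing input validation" or as a buffer overflow.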

On Wed, May 20, 2020 at 2:11 PM Steve Overland <[hidden email]> wrote:

--
Kurt Seifried
[hidden email]
Re: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

Steve Overland
I agree we should publish the most practical guidance possible.

Is this the appropriate forum to elaborate on the sub-types?

I have always found reliably validating file types to be challenging.
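
As a small illustration of why this is hard, here is a sketch in Python that combines an extension allow-list with a check of the file's leading signature bytes. The `SIGNATURES` table and the `looks_like` function are hypothetical; the signature byte strings are the well-known leading bytes for these formats.

```python
from pathlib import Path

# Leading signature ("magic") bytes for a few common formats.
SIGNATURES = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".pdf": b"%PDF-",
    ".gif": b"GIF8",
}


def looks_like(filename: str, content: bytes) -> bool:
    """True if the extension is allowed and the content starts with its signature."""
    signature = SIGNATURES.get(Path(filename).suffix.lower())
    return signature is not None and content.startswith(signature)


# A payload that starts with a valid GIF signature but carries markup
# afterwards still passes, so this check is necessary but not sufficient.
polyglot_passes = looks_like("avatar.gif", b"GIF89a<script>alert(1)</script>")
```

Both checks only look at what the file claims to be; neither confirms that the rest of the content actually parses as that format.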



On Wed., May 20, 2020, 1:35 p.m. Kurt Seifried, <[hidden email]> wrote:

RE: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

Christey, Steven M.
In reply to this post by Steve Overland

I agree that many attacks are input-based, but they’re also often output-based.  One of the areas I’m wrestling with is that many people consider injection-related issues to be only input validation problems, instead of poor output encoding or similar issues.  The classic example I use is how you can’t enter the surname “O’Reilly” into a database using only input validation, yet SQLi and XSS are commonly thought of as “input validation problems.”  We try to distinguish CWE-20 from injection in terms of CanPrecede relationships, but this may need to be spelled out more clearly.
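
A minimal sketch of the “O’Reilly” point, using Python's standard `sqlite3` and `html` modules. The table and variable names are hypothetical. The surname is accepted unchanged; what differs is how it is handed to the SQL and HTML contexts.

```python
import html
import sqlite3

surname = "O'Reilly"  # legitimate input that contains a quote character

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (surname TEXT)")

# Building the statement by concatenation breaks on the apostrophe.
try:
    conn.execute("INSERT INTO people (surname) VALUES ('" + surname + "')")
    concatenation_failed = False
except sqlite3.OperationalError:
    concatenation_failed = True

# Parameterized query: the driver keeps data separate from SQL syntax,
# so the apostrophe needs no rejection or stripping on input.
conn.execute("INSERT INTO people (surname) VALUES (?)", (surname,))
stored = conn.execute("SELECT surname FROM people").fetchone()[0]

# Output encoding for an HTML context, applied when the value is written out.
cell = "<td>" + html.escape(surname) + "</td>"
```

No input validation rule that admits real surnames would have rejected this value, which is why the fix lives where the data is used rather than where it arrives.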

 

I’m not sure how fixable that kind of user confusion is on CWE’s part, but I also worry about the indirect implication that every input problem is solved just by checking the input. Consider long-standing OWASP guidance to effectively avoid IDOR-like problems by mapping a reference ID to a hard-coded (or programmer-controlled) value.  That’s not “input validation” per se, as you’re not even processing the input directly any more.  The CWE Vulnerability Theory document refers to this as “indirect selection.”
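
A minimal sketch of that mapping approach in Python. The `ALLOWED_REPORTS` table, its paths, and `resolve_report` are hypothetical names used only for illustration.

```python
# Server-side table of the documents a client may open.
# Keys are opaque selectors; values are the real, programmer-controlled paths.
ALLOWED_REPORTS = {
    "1": "/srv/reports/q1-summary.pdf",
    "2": "/srv/reports/q2-summary.pdf",
}


def resolve_report(selector: str) -> str:
    """Map a client-supplied selector to a server-chosen path."""
    try:
        return ALLOWED_REPORTS[selector]
    except KeyError:
        raise LookupError("unknown report selector") from None
```

The client-supplied string is only ever used as a lookup key and never becomes part of the path, so there is nothing in it to sanitize.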

 

The “input validation” term itself may be part of the issue, since it’s interpreted in so many different ways by different people.  Consider improper certificate validation by not following the chain of trust.  That’s not exactly “input” in the sense of data being directly entered into a program to be processed, yet there is “validation.”
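
For a concrete picture of the certificate case, here is a short sketch using Python's standard `ssl` module; the variable names are hypothetical. It contrasts a client context that verifies the peer's chain and host name with one where verification has been switched off.

```python
import ssl

# Default client context: verifies the peer's chain against trusted roots
# and checks that the certificate matches the requested host name.
verifying = ssl.create_default_context()

# A context with verification switched off. The TLS handshake still
# receives a certificate, but nothing about it is validated.
non_verifying = ssl.create_default_context()
non_verifying.check_hostname = False  # must be cleared before CERT_NONE
non_verifying.verify_mode = ssl.CERT_NONE
```

In the second context the program still receives a certificate from the peer; the weakness is that the trust decision is skipped, not that a string went unchecked.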

 

But the challenges with CWE-20 and other entries are multi-faceted.  We effectively have a use case where we need to make it easier for people to map to the correct or best CWE IDs, but we also separately have a use case of teaching developers, and addressing one use case won’t necessarily fix the other.

 

- Steve

 

 

 

From: Steve Overland <[hidden email]>
Sent: Wednesday, May 20, 2020 4:11 PM
To: Steven M Christey <[hidden email]>
Cc: Seifried, Kurt <[hidden email]>; CWE Research Discussion <[hidden email]>
Subject: Re: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

 

Re: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

Steve Overland
Steven, I think your cert example highlights that input validation is a trust issue. You are trusting data to be something it might not be. More validation earns more trust.

I would distinguish the cert example as PKI authenticity trust validation. But yes, the certs are inputs into the system.

On Wed., May 20, 2020, 2:16 p.m. David Jacobs, <[hidden email]> wrote:
Please remove me!  

I have no idea why I get these and I have tried to make them stop for years 

Sent from my iPhone

On May 20, 2020, at 4:09 PM, Steven M Christey <[hidden email]> wrote:



I agree that many attacks are input-based, but they’re also often output-based.  One of the areas I’m wrestling with is that many people consider injection-related issues to be only input validation problems, instead of poor output encoding or similar issues.  The classic example I use is how you can’t enter the surname “O’Reilly” into a database using only input validation, yet SQLi and XSS are commonly thought of as “input validation problems.”  We try to distinguish CWE-20 from injection in terms of CanPrecede relationships, but this may need to be spelled out more clearly.

 

I’m not sure how fixable that kind of user confusion is on CWE’s part, but I also worry about the indirect implication that every input problem is solved just by checking the input. Consider long-running OWASP guidance to effectively avoid IDOR-like problems by mapping a reference ID to a hard-coded (or programmer-controlled) value.  That’s not “input validation” per se, as you’re not even processing the input directly any more.  The CWE Vulnerability Theory document refers to this as “indirect selection.”

 

The “input validation” term itself may be part of the issue, since it’s interpreted in so many different ways by different people.  Consider improper certificate validation by not following the chain of trust.  That’s not exactly “input” in the sense of data being directly entered into a program to be processed, yet there is “validation.”

 

But the challenges with CWE-20 and other entries is multi-faceted.  We effectively have a use case where we need to make it easier for people to map to the correct or best CWE IDs, but we also separately have a use case that teaches developers, and addressing one use case won’t necessarily fix the other.

 

- Steve

 

 

 

From: Steve Overland <[hidden email]>
Sent: Wednesday, May 20, 2020 4:11 PM
To: Steven M Christey <[hidden email]>
Cc: Seifried, Kurt <[hidden email]>; CWE Research Discussion <[hidden email]>
Subject: Re: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

 

This will be an interesting discussion I'm sure.

 

I am about to submit an input validation training session proposal for the upcoming OWASP app sec virtual conference.  A deep dive into character encoding.

 

Offhand comment...  I think a big reason CWE-20 would be referenced so often is...  well...  what percentage of attacks are NOT input-based?

 

- Steve

 

 

 

On Tue, May 19, 2020 at 12:20 PM Steven M Christey <[hidden email]> wrote:

Kurt,

 

In terms of splitting out CWE-20, you’re suggesting something similar to what I’ve been thinking in terms of splitting out CWE-20, which already has various children.  One could imagine types of validation related to size or quantity, position or index, syntactic correctness, semantic correctness, consistency/relationships between multiple data elements, appropriate type, etc.  Our analysis of CWE-20 issues for the Top 25 will effectively provide us with some real-world examples that we can look at to see if these kinds of splits make sense, as will learning about what code analysis tools examine  and currently label as CWE-20.

 

- Steve

 

 

From: Kurt Seifried <[hidden email]>
Sent: Tuesday, May 19, 2020 11:03 AM
To: Steven M Christey <[hidden email]>
Cc: CWE Research Discussion <[hidden email]>
Subject: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

 

I would like to see CWE-20 split out into subtypes at a minimum. What kind of improper validation? Assumption about the data? Flaw in the validation itself? Data types? Data format? Data content? Data length (truncation and too long)?

 

On Tue, May 19, 2020 at 8:31 AM Steven M Christey <[hidden email]> wrote:

All,

We have been more closely investigating CWE-20 and its apparent inconsistencies with the validation-oriented category CWE-1215.  We will make  appropriate modifications in CWE 4.1, which we hope to release in the early summer.  At the very least, CWE-20 needs to be modified so that CWE-1215 is not seen as a viable alternative by consumers. Generally, category CWEs should not be used for mapping, except perhaps in rollup summaries for code analysis tools. The need to use a category instead of a weakness indicates some kind of problem that needs to be addressed.

Since we are also working on this year's CWE Top 25, we will be taking special note of the reasons why CWE-20 is used so heavily in NVD.  We will investigate how often "improper input validation" is the only weakness information given by the original vendor; when it's effectively used as a placeholder for rare or product-specific weaknesses that are difficult to analyze, suggesting opportunities for training and guidance; when it indicates gaps or usability problems in CWE itself; and other possible reasons.

We believe that some commonly-used, class-level CWE entries may be used too often because the entries themselves are not well organized.  Some other class-level entries on the Top 25 may have similar problems, such as CWE-284 improper access control. People have spent decades studying and classifying buffer overflows and injection problems, but some parts of CWE could benefit from more targeted research, including CWE-20.

If you have your own critiques of CWE-20 itself, or how it's used by others, please let us know, either on this list or privately. We also welcome any suggestions on how it could be improved.

We will keep you informed of our progress over the coming weeks.


Thank you,
Steve Christey Coley

CWE Technical Lead

 


 

--

Kurt Seifried
[hidden email]


RE: RE: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

Arthur Hicken
In reply to this post by Christey, Steven M.

I agree. In a world of dynamic web-based UI, output IS input.

 


 

From: [hidden email]
Sent: Wednesday, May 20, 2020 2:09 PM
To: [hidden email]
Cc: [hidden email]; [hidden email]
Subject: RE: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

 

I agree that many attacks are input-based, but they’re also often output-based.  One of the areas I’m wrestling with is that many people consider injection-related issues to be only input validation problems, instead of poor output encoding or similar issues.  The classic example I use is how you can’t enter the surname “O’Reilly” into a database using only input validation, yet SQLi and XSS are commonly thought of as “input validation problems.”  We try to distinguish CWE-20 from injection in terms of CanPrecede relationships, but this may need to be spelled out more clearly.
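A minimal sketch of the O'Reilly point, assuming Python's standard sqlite3 module and a hypothetical `people` table (neither is from the original thread). The surname is accepted as-is, with no character filtering; the quote is neutralized where the data is handed to the SQL interpreter, via a bound parameter, and again where it is emitted into HTML.

```python
import html
import sqlite3


def save_surname(conn, surname):
    # The value is passed as a bound parameter, so the database driver never
    # parses it as SQL. No character needs to be rejected or stripped.
    conn.execute("INSERT INTO people (surname) VALUES (?)", (surname,))


def render_surname(surname):
    # Encoding happens at the output boundary, for the HTML context.
    return "<td>" + html.escape(surname) + "</td>"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (surname TEXT)")  # hypothetical schema

hostile = "x'); DROP TABLE people; --"
save_surname(conn, "O'Reilly")
save_surname(conn, hostile)

rows = [r[0] for r in conn.execute("SELECT surname FROM people ORDER BY rowid")]
cell = render_surname("<script>alert(1)</script>")
```

Both strings are stored verbatim, including the hostile one, which suggests why an input filter that bans apostrophes would reject a legitimate name without being the actual fix.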

 

I’m not sure how fixable that kind of user confusion is on CWE’s part, but I also worry about the indirect implication that every input problem is solved just by checking the input. Consider long-running OWASP guidance to effectively avoid IDOR-like problems by mapping a reference ID to a hard-coded (or programmer-controlled) value.  That’s not “input validation” per se, as you’re not even processing the input directly any more.  The CWE Vulnerability Theory document refers to this as “indirect selection.”
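One possible reading of that indirect-selection guidance, sketched in Python. The `ReferenceMap` class and the invoice identifiers are hypothetical names chosen for illustration, not taken from OWASP or CWE material. The server hands out opaque tokens and keeps the real identifiers to itself, so the client-supplied value is only ever used as a lookup key.

```python
import secrets


class ReferenceMap:
    """Per-session map from opaque tokens to objects the user may access."""

    def __init__(self, allowed_ids):
        # The real identifiers are chosen by the program, not by the client.
        self._by_token = {secrets.token_urlsafe(8): real_id for real_id in allowed_ids}

    def tokens(self):
        # These are the only values ever sent to the client.
        return list(self._by_token)

    def resolve(self, token):
        # The client value selects an entry; it is never turned into an
        # identifier, path, or query fragment. Unknown tokens yield None.
        return self._by_token.get(token)


refs = ReferenceMap(["invoice-1001", "invoice-1002"])
resolved = sorted(refs.resolve(t) for t in refs.tokens())
guessed = refs.resolve("invoice-9999")
```

Nothing here inspects the content of the token beyond a dictionary lookup, which is the sense in which this is not input validation per se.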

 

The “input validation” term itself may be part of the issue, since it’s interpreted in so many different ways by different people.  Consider improper certificate validation by not following the chain of trust.  That’s not exactly “input” in the sense of data being directly entered into a program to be processed, yet there is “validation.”

 

But the challenges with CWE-20 and other entries are multi-faceted.  We effectively have a use case where we need to make it easier for people to map to the correct or best CWE IDs, but we also separately have a use case that teaches developers, and addressing one use case won’t necessarily fix the other.

 

- Steve

 

 

 

From: Steve Overland <[hidden email]>
Sent: Wednesday, May 20, 2020 4:11 PM
To: Steven M Christey <[hidden email]>
Cc: Seifried, Kurt <[hidden email]>; CWE Research Discussion <[hidden email]>
Subject: Re: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

 

This will be an interesting discussion I'm sure.

 

I am about to submit an input validation training session proposal for the upcoming OWASP app sec virtual conference.  A deep dive into character encoding.

 

Offhand comment...  I think a big reason CWE-20 would be referenced so often is...  well...  what percentage of attacks are NOT input-based?

 

- Steve

 

 

 

On Tue, May 19, 2020 at 12:20 PM Steven M Christey <[hidden email]> wrote:

Kurt,

 

In terms of splitting out CWE-20, which already has various children, you’re suggesting something similar to what I’ve been thinking.  One could imagine types of validation related to size or quantity, position or index, syntactic correctness, semantic correctness, consistency/relationships between multiple data elements, appropriate type, etc.  Our analysis of CWE-20 issues for the Top 25 will provide us with real-world examples that we can use to see whether these kinds of splits make sense, as will learning what code analysis tools examine and currently label as CWE-20.
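To make those facets concrete, here is a rough Python sketch in which each check is labeled with the kind of validation it performs. The booking request, its field names, and the limit of 8 guests are all invented for illustration; this is not a proposed CWE structure.

```python
import re
from datetime import date

DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_booking(raw, seats):
    """Return a list of 'facet: message' strings; empty means acceptable."""
    errors = []

    guests = raw.get("guests")
    if not is_int(guests):
        errors.append("type: guests must be an integer")
    elif not 1 <= guests <= 8:  # hypothetical limit
        errors.append("size: guests must be between 1 and 8")

    seat = raw.get("seat_index")
    if not is_int(seat):
        errors.append("type: seat_index must be an integer")
    elif not 0 <= seat < len(seats):
        errors.append("index: seat_index is outside the seat list")

    parsed = {}
    for field in ("start", "end"):
        text = raw.get(field)
        if not isinstance(text, str) or not DATE_RE.fullmatch(text):
            errors.append(f"syntax: {field} must look like YYYY-MM-DD")
            continue
        try:
            parsed[field] = date.fromisoformat(text)
        except ValueError:
            errors.append(f"semantics: {field} is not a real calendar date")

    if len(parsed) == 2 and parsed["end"] < parsed["start"]:
        errors.append("consistency: end is earlier than start")

    return errors


good = validate_booking(
    {"guests": 2, "seat_index": 0, "start": "2020-05-19", "end": "2020-05-20"},
    ["A", "B"],
)
bad = validate_booking(
    {"guests": "2", "seat_index": 5, "start": "2020-02-30", "end": "05/20/2020"},
    ["A", "B"],
)
```

Each error string begins with the facet it belongs to, so a single request can fail for several distinct reasons that would all be labeled "improper input validation" today.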

 

- Steve

 

 

From: Kurt Seifried <[hidden email]>
Sent: Tuesday, May 19, 2020 11:03 AM
To: Steven M Christey <[hidden email]>
Cc: CWE Research Discussion <[hidden email]>
Subject: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

 

I would like to see CWE-20 split out into subtypes at a minimum. What kind of improper validation? Assumption about the data? Flaw in the validation itself? Data types? Data format? Data content? Data length (truncation and too long)?

 


 


Re: RE: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

Kurt Seifried
More correctly: your output goes to yet another input. By definition, output goes to something; why else would it exist? That processing may be delayed (e.g. writing a log file that gets ignored forever until an incident happens), but there is obviously some intent to use the data later; otherwise you'd simply delete it(*)

* Except in cases where deleting it may actually be more costly than simply storing it long term.

On Wed, May 20, 2020 at 4:05 PM Arthur “Code Curmudgeon” Hicken <[hidden email]> wrote:

I agree. In a world of dynamic web-based UI, output IS input.

 


 



--
Kurt Seifried
[hidden email]

Re: [EXT] Re: Update on CWE-20 (improper input validation) and other difficult CWEs

Jukka Ruohonen
In reply to this post by Kurt Seifried
On Tue, May 19, 2020 at 09:03:13AM -0600, Kurt Seifried wrote:
>    I would like to see CWE-20 split out into subtypes at a minimum. What
>    kind of improper validation? Assumption about the data? Flaw in the
>    validation itself? Data types? Data format? Data content? Data length
>    (truncation and too long)?

I think something like this would indeed be worthwhile in terms of
education.  When teaching undergraduate computer science students, I
increasingly try to point them toward CWE due to the recent improvements
with concrete code examples. However, things like CWE-20 are also
difficult pedagogically. So I tend to resort to CWE-89 or something
similar, as I suspect most teachers would.

Generally, it would be useful if this kind of tree demonstrated how
the eventual bug is incrementally "constructed". Here, I am thinking about
some kind of flow chart with concrete code snippets.

Of course, use cases vary, i.e. just a few cents from academia.

- Jukka