All, CWE Technical Lead |
I would like to see CWE-20 split out into subtypes at a minimum. What kind of improper validation? Assumption about the data? Flaw in the validation itself? Data types? Data format? Data content? Data length (truncation and too long)? On Tue, May 19, 2020 at 8:31 AM Steven M Christey <[hidden email]> wrote:
Kurt Seifried
[hidden email] |
In reply to this post by Steven M Christey
Kurt, what you’re suggesting is similar to what I’ve been thinking about for splitting out CWE-20, which already has various children. One could imagine types of validation related to size or quantity, position or index, syntactic correctness, semantic correctness, consistency/relationships between multiple data elements, appropriate type, etc. Our analysis of CWE-20 issues for the Top 25 will effectively provide us with some real-world examples that we can look at to see if these kinds of splits make sense, as will learning about what code analysis tools examine and currently label as CWE-20. - Steve
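As a rough illustration of how several of the validation categories named above differ in practice, here is a minimal Python sketch. The booking form, its field names, and the `validate_booking` helper are all hypothetical and are not taken from CWE; the sketch covers type, size/quantity, syntactic, semantic, and consistency checks, and leaves out position/index.

```python
import re
from datetime import date

# Hypothetical rule: usernames are 3-20 ASCII letters, digits, or underscores.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_booking(form: dict, seats_available: int) -> list:
    """Return a list of error strings, each prefixed with the kind of check that failed."""
    errors = []

    # Appropriate type: seats must be an int (bool is excluded explicitly).
    seats = form.get("seats")
    if not isinstance(seats, int) or isinstance(seats, bool):
        errors.append("type: seats must be an integer")
    # Size or quantity: only meaningful once the type is known to be right.
    elif not 1 <= seats <= seats_available:
        errors.append("size: seats out of range")

    # Syntactic correctness: the username must match the expected pattern.
    username = form.get("username")
    if not isinstance(username, str) or not USERNAME_RE.fullmatch(username):
        errors.append("syntax: malformed username")

    # Semantic correctness: a well-formed string such as 2020-02-30 is still
    # not a real calendar date, so parsing rejects it.
    start = end = None
    try:
        start = date.fromisoformat(form.get("start", ""))
        end = date.fromisoformat(form.get("end", ""))
    except (TypeError, ValueError):
        errors.append("semantic: start/end must be real ISO dates")

    # Consistency between multiple data elements: end must not precede start.
    if start is not None and end is not None and end < start:
        errors.append("consistency: end is before start")

    return errors
```

Each failure carries a different prefix, which is roughly the distinction a set of CWE-20 subtypes would have to capture.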
-- Kurt Seifried |
This will be an interesting discussion I'm sure. I am about to submit an input validation training session proposal for the upcoming OWASP app sec virtual conference. A deep dive into character encoding. Offhand comment... I think a big reason CWE-20 would be referenced so often is... well... what percentage of attacks are NOT input-based? - Steve On Tue, May 19, 2020 at 12:20 PM Steven M Christey <[hidden email]> wrote:
|
I mean, in theory everything could be caught by some sort of input validation (e.g. if you know the max size a buffer should be, you could add a check to ensure the input isn't over that and avoid a buffer overflow). That's not helpful, though; it's like saying "the secret to being rich is to make more money than you spend". I think the only real option is to add subtypes and then see if they need to be/can be split out into new things. On Wed, May 20, 2020 at 2:11 PM Steve Overland <[hidden email]> wrote:
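The buffer-size check mentioned above can be sketched in a few lines. This is only an illustration in Python, where a `bytearray` slice assignment would resize the buffer rather than overflow it; the explicit length check stands in for the bounds check a C program would need. The function name and the 16-byte size are assumptions made for the example.

```python
BUF_SIZE = 16  # hypothetical fixed capacity


def copy_into_fixed_buffer(buf: bytearray, data: bytes) -> None:
    """Copy data into buf, refusing input longer than the buffer."""
    # Data length check: reject oversized input instead of truncating it
    # silently or writing past the end of the buffer.
    if len(data) > len(buf):
        raise ValueError("input longer than buffer")
    # The slice is exactly len(data) long, so the buffer keeps its size.
    buf[: len(data)] = data


buf = bytearray(BUF_SIZE)
copy_into_fixed_buffer(buf, b"hello")
```

The check is trivial to write once the maximum size is known, which is the point of the remark: the hard part is knowing which check to write, not writing it.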
Kurt Seifried
[hidden email] |
I agree we should publish the most practical guidance possible. Is this the appropriate forum to elaborate on the sub-types? I have always found reliably validating file types to be challenging. On Wed., May 20, 2020, 1:35 p.m. Kurt Seifried, <[hidden email]> wrote:
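To make the file-type difficulty concrete, here is a minimal sketch that checks both the claimed extension and the leading signature bytes of a PNG upload. The function name is hypothetical, and the check is deliberately incomplete: a file can start with a valid signature and still contain something else entirely, which is part of why this kind of validation is hard to do reliably.

```python
# The 8-byte signature that every PNG file starts with.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def looks_like_png(filename: str, content: bytes) -> bool:
    """Return True only if the name and the leading bytes both claim PNG."""
    # The extension is attacker-controlled metadata, so it is checked
    # only together with the content, never on its own.
    has_png_name = filename.lower().endswith(".png")
    has_png_magic = content.startswith(PNG_MAGIC)
    return has_png_name and has_png_magic
```

A stricter approach would parse the whole file with a trusted decoder and re-encode it, but that is outside the scope of this sketch.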
|
In reply to this post by Steve Overland
I agree that many attacks are input-based, but they’re also often output-based. One of the areas I’m wrestling with is that many people consider injection-related issues to be only input validation problems, instead of poor output encoding
or similar issues. The classic example I use is how you can’t enter the surname “O’Reilly” into a database using only input validation, yet SQLi and XSS are commonly thought of as “input validation problems.” We try to distinguish CWE-20 from injection in
terms of CanPrecede relationships, but this may need to be spelled out more clearly. I’m not sure how fixable that kind of user confusion is on CWE’s part, but I also worry about the indirect implication that every input problem is solved just by checking the input. Consider long-running OWASP guidance to effectively avoid
IDOR-like problems by mapping a reference ID to a hard-coded (or programmer-controlled) value. That’s not “input validation” per se, as you’re not even processing the input directly any more. The CWE Vulnerability Theory document refers to this as “indirect
selection.” The “input validation” term itself may be part of the issue, since it’s interpreted in so many different ways by different people. Consider improper certificate validation by not following the chain of trust. That’s not exactly “input”
in the sense of data being directly entered into a program to be processed, yet there is “validation.” But the challenges with CWE-20 and other entries are multi-faceted. We effectively have a use case where we need to make it easier for people to map to the correct or best CWE IDs, but we also separately have a use case that teaches developers, and addressing one use case won’t necessarily fix the other. - Steve
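Two of the points in the message above lend themselves to a short sketch: storing the surname “O’Reilly” without rejecting the apostrophe, and avoiding IDOR-like problems through indirect selection. The example below uses Python's standard `sqlite3` module with bound parameters; the `people` table, the `REPORTS` mapping, and all function names are hypothetical and chosen only for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical people table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (surname TEXT)")
    return conn


def add_person(conn: sqlite3.Connection, surname: str) -> None:
    # The value is bound as a parameter, so an apostrophe is stored as data
    # and never becomes part of the SQL text. No input is rejected.
    conn.execute("INSERT INTO people (surname) VALUES (?)", (surname,))


def find_person(conn: sqlite3.Connection, surname: str) -> list:
    cur = conn.execute("SELECT surname FROM people WHERE surname = ?", (surname,))
    return cur.fetchall()


# Indirect selection: the user supplies a reference ID, and the program maps
# it to a programmer-controlled value. The input is never used as a path.
REPORTS = {"1": "reports/q1.pdf", "2": "reports/q2.pdf"}


def resolve_report(ref_id: str):
    """Return the programmer-controlled path for ref_id, or None if unknown."""
    return REPORTS.get(ref_id)
```

In neither function is the input "validated" in the usual sense, yet both the injection and the direct-reference problem are avoided, which is the distinction the message draws.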
|
Steven, I think your cert example highlights that input validation is a trust issue. You are trusting data to be something it might not be. More validation earns more trust. I would distinguish the cert example as PKI authenticity trust validation. But yes, the certs are inputs into the system. On Wed., May 20, 2020, 2:16 p.m. David Jacobs, <[hidden email]> wrote:
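For the certificate case, the "validation" is usually delegated to the TLS library rather than written by hand. As a small sketch, Python's standard `ssl.create_default_context()` returns a client context that requires a certificate, verifies its chain against the system trust store, and checks the hostname. The wrapper function name is an assumption for the example.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that verifies the peer certificate."""
    # The default purpose is server authentication, which turns on both
    # certificate verification and hostname checking.
    ctx = ssl.create_default_context()
    # Improper certificate validation typically means weakening these two
    # settings (check_hostname = False, verify_mode = ssl.CERT_NONE).
    return ctx


ctx = strict_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

Seen this way, the certificate is an input whose trust is established by a chain check, not by inspecting its format.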
|
In reply to this post by Steven M Christey
I agree. In a world of dynamic web-based UI, output IS input.
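A minimal sketch of the output side of that claim: when a value is written into an HTML page, the browser treats the page as input, so the value is encoded at the point of output. The example uses `html.escape` from the Python standard library; the `render_greeting` function is hypothetical.

```python
import html


def render_greeting(surname: str) -> str:
    """Build an HTML fragment, encoding the value for the HTML context."""
    # Encoding happens at the output boundary. The stored surname stays
    # unchanged, so a name with an apostrophe is neither rejected nor mangled.
    return "<p>Hello, " + html.escape(surname) + "</p>"


print(render_greeting("O'Reilly"))
print(render_greeting("<script>alert(1)</script>"))
```

The same surname would need a different encoding if it were written into a URL or a JavaScript string, which is why output encoding cannot be replaced by a single check on input.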
Kurt Seifried |
More correctly: your output goes to yet another input. By definition, output goes to something; why else would it exist? That processing may be delayed (e.g. writing a log file that gets ignored forever until an incident happens), but there is obviously some intent to use the data later; otherwise you'd simply delete it. (*) * Except in cases where deleting it may actually be more costly than simply storing it long term. On Wed, May 20, 2020 at 4:05 PM Arthur “Code Curmudgeon” Hicken <[hidden email]> wrote:
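The log file example can be made concrete. A log line is output today and input to whoever (or whatever) reads it during an incident, so line breaks in user-supplied values can forge entries. The sketch below neutralizes them before writing; the function name and the escaping scheme are assumptions, not a standard.

```python
def neutralize_for_log(value: str) -> str:
    """Escape characters that would let a value span or forge log lines."""
    # Backslashes are escaped first so the result stays unambiguous for
    # the later reader; then CR and LF become visible two-character forms.
    return (
        value.replace("\\", "\\\\")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )


# A hypothetical username that tries to append a fake log entry.
attempt = "alice\n2020-05-20 admin login OK"
print("login failed for user=" + neutralize_for_log(attempt))
```

The value is kept in full, only rendered harmless for the log's line-oriented format, since the delayed reader may need to see exactly what was submitted.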
Kurt Seifried
[hidden email] |
In reply to this post by Kurt Seifried
On Tue, May 19, 2020 at 09:03:13AM -0600, Kurt Seifried wrote:
> I would like to see CWE-20 split out into subtypes at a minimum. What
> kind of improper validation? Assumption about the data? Flaw in the
> validation itself? Data types? Data format? Data content? Data length
> (truncation and too long)?

I think something like this would indeed be worthwhile in terms of education. When teaching undergraduate computer science students, I increasingly try to point them toward CWE because of the recent improvements with concrete code examples. However, entries like CWE-20 are also difficult pedagogically, so I tend to resort to CWE-89 or something similar, as I suspect most teachers would. Generally, it would be useful if this kind of tree demonstrated how the eventual bug is incrementally "constructed". Here, I am thinking of some kind of flow chart with concrete code snippets. Of course, use cases vary; these are just a few cents from academia. - Jukka