In this section we'll discuss in detail how to prepare your organization to launch and run a successful vulnerability disclosure program, including practical advice on how to fill in the gaps you've identified.
Augmenting your existing security program with outside security researchers is a great way to find complex issues and obscure vulnerabilities. However, using a VDP to find basic security problems that could be discovered internally is a waste of resources.
When it comes to finding bugs, you can only know where to start if you have a good idea of what's out there. You can buy a hundred security tools, but they won't make any difference if teams are standing up applications, systems, and services ad hoc without your knowledge, especially if you have no way to discover these assets and perform security assessments against them. Check with the individuals and teams responsible for standing up new applications, systems, and services to see if they have a process in place for creating and maintaining an inventory of what's getting spun up and who owns it. If there isn't a current process, this is a great opportunity to collaborate with these teams to build one. Gaining an understanding of your organization's assets is the best place to start when identifying your attack surface. As part of this process, the security team should be involved when new infrastructure is developed so that it can provide security reviews. It is good practice to have an extensive inventory of assets and owners. This kind of inventory is useful when applying new patches that require certain systems to be temporarily shut down: it provides a road map of which systems are affected and which individuals or teams need to be informed. Having a robust asset management process in place ensures that owners are identified early and kept up to date, and that all systems across the organization operate as intended.
In addition to proactive asset management, consider what reactive measures you can implement as well to identify assets that belong to your organization, but slipped through the cracks of your standardized asset management processes. This can include using the same "reconnaissance" processes used by security researchers that participate in VDPs and bug bounty programs. For example, you can leverage free and open source tools that scan and enumerate Internet-facing IP ranges or domains that may belong to your organization. A Google search for bug bounty recon will produce a variety of tips and tricks to help you identify assets from your organization you were unaware of.
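As a rough illustration of the reactive check described above, the following sketch compares hostnames found by a reconnaissance tool against an asset inventory and reports the ones nobody has registered. The inventory layout, hostnames, and team names are assumptions made up for this example; your own inventory will likely live in a database or spreadsheet.

```python
def find_unknown_assets(discovered_hosts, inventory):
    """Return discovered hostnames that have no entry in the inventory."""
    # Normalize case and trailing dots so "API.example.com." matches "api.example.com".
    known = {host.lower().rstrip(".") for host in inventory}
    unknown = []
    for host in discovered_hosts:
        normalized = host.lower().rstrip(".")
        if normalized not in known and normalized not in unknown:
            unknown.append(normalized)
    return unknown


# Hypothetical inventory: hostname -> owning team.
inventory = {
    "www.example.com": "web-team",
    "api.example.com": "platform-team",
}

# Hypothetical output of a reconnaissance tool.
discovered = ["www.example.com", "API.example.com.", "staging.example.com"]

print(find_unknown_assets(discovered, inventory))
```

Anything this returns is a candidate for follow-up: find the owner, add the asset to the inventory, and include it in your scans.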
Basic vulnerability scanning
Now that you have a solid foundation of where you need to find security issues, let's dive into how you actually do that. You can go into various levels of depth, but you need to find a balance between your internal security efforts and the external hacking community you engage through your vulnerability disclosure program. This balance is different for every organization, depending on the resources available.
Choose your tools
There are many different tools to help with identifying vulnerabilities. Some vulnerability scanning tools are available for free, while others come at a cost. Which tools to pick depends on your individual needs. Consider:
- Your organization's requirements
- How well each tool satisfies these requirements
- Whether the benefits of the tool outweigh the costs (financial and implementation)
You can use this requirements template (OpenDocument .ods, Microsoft Excel .xlsx) to evaluate various tools against your requirements. Some example requirements are included in the template, but you should discuss with your security, IT, and engineering teams to align on required capabilities. Before launching a vulnerability disclosure program, at a minimum, you'll want to be able to perform vulnerability scans against any externally facing assets (such as websites, APIs, mobile apps). This will help you find and fix easily discoverable vulnerabilities before you invite external security researchers to test against these assets and services.
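One way to compare tools against your requirements is a simple weighted score. The sketch below is only an illustration of that idea: the requirement names, weights, and 0 to 5 ratings are invented for the example and are not taken from the template mentioned above. Cost is not part of the score and would need to be weighed separately.

```python
def score_tool(requirement_weights, tool_ratings):
    """Weighted average of a tool's ratings (0-5) across all requirements."""
    total_weight = sum(requirement_weights.values())
    weighted = sum(
        weight * tool_ratings.get(requirement, 0)  # unrated requirement counts as 0
        for requirement, weight in requirement_weights.items()
    )
    return weighted / total_weight


# Hypothetical requirements and how much each matters (higher = more important).
weights = {
    "scans_web_apps": 3,
    "authenticated_scans": 2,
    "issue_tracker_integration": 1,
}

# Hypothetical ratings gathered from your security, IT, and engineering teams.
ratings = {
    "tool_a": {"scans_web_apps": 5, "authenticated_scans": 2, "issue_tracker_integration": 4},
    "tool_b": {"scans_web_apps": 4, "authenticated_scans": 5, "issue_tracker_integration": 3},
}

ranked = sorted(ratings, key=lambda name: score_tool(weights, ratings[name]), reverse=True)
for name in ranked:
    print(name, round(score_tool(weights, ratings[name]), 2))
```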
Automated vulnerability scans can find a lot of issues, but they can also produce false positives. That is why it is necessary to have resources to validate the results before sharing them with the impacted teams. You'll need to implement processes to ensure scans are run on a regular basis, and that the results of these scans are actually addressed. This will look different for every organization, but at a minimum, you'll want to determine:
- Scan frequency
- Which assets are being scanned
- Authenticated vs. unauthenticated scans
  - (hint: if you do not scan with credentials, and a security researcher then tests with credentials once your VDP starts, you might get a large spike of identified vulnerabilities)
- Roles and responsibilities
  - Identify the team members responsible for running the scans
  - Set up a rotation if necessary
- Scan results
  - Verifying scan results
  - Filing bugs for verified vulnerabilities
  - Identifying owners to fix bugs
  - Following up with owners on remediation
We'll go into more detail on how to ensure identified security issues are fixed in the Fixing Bugs section later in this guide.
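The scan-results steps listed above (verify, file a bug, identify an owner, follow up) can be tracked with very little tooling. The following is a minimal sketch under assumed names: the `ScanFinding` fields and the seven-day follow-up interval are illustrative choices, not part of any particular scanner or issue tracker.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class ScanFinding:
    title: str
    asset: str
    verified: bool = False               # confirmed as a true positive by a person
    bug_id: Optional[str] = None         # set once a bug is filed
    owner: Optional[str] = None          # set once an owner is identified
    last_follow_up: Optional[date] = None


def next_step(finding: ScanFinding, today: date, follow_up_days: int = 7) -> str:
    """Return the next action for a finding, in the order the steps are listed above."""
    if not finding.verified:
        return "verify"
    if finding.bug_id is None:
        return "file bug"
    if finding.owner is None:
        return "identify owner"
    if finding.last_follow_up is None:
        return "follow up"
    if today - finding.last_follow_up >= timedelta(days=follow_up_days):
        return "follow up"
    return "wait"


finding = ScanFinding(title="Outdated TLS configuration", asset="www.example.com")
print(next_step(finding, today=date(2024, 1, 10)))
```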
Security review process
While vulnerability scanning is a great way to reactively identify security issues in your organization, implementing security review processes can help prevent vulnerabilities from being introduced in the first place. For the purpose of this guide, the term security review refers to any situation that triggers a manual review by a member of your security team. Typically, this includes having the authority to block a change if it's deemed too risky. If your security team doesn't have the ability to block risky changes, you'll still want to have processes in place to document the risk. This can help ensure whoever is pushing for the change knows the risk involved, and proactively accepts that risk.
Security review criteria
When should security reviews happen? Creating a set of criteria that triggers a security review helps ensure everyone is on the same page. Below are some examples of scenarios that might trigger a security review.
- New functionality related to sensitive user data is proposed
  - A new feature that allows users to share their location on a map
  - Requesting potentially sensitive information from users, such as their home address, date of birth, or phone number
- Major updates to existing functionality are made
  - Taking existing user data and using it in a new way that users might not expect without giving them an opportunity to opt out
  - Changes to any features related to authentication, authorization, and session management
- Changes to the company's production environment
  - Network configuration changes, especially changes that might result in exposing services externally
  - Installation of new software that handles sensitive user data or that, if compromised, could indirectly be used to access sensitive user data
  - Standing up new systems or services
- Interacting with a new vendor or changing how you work with an existing one
  - Onboarding a new vendor that will handle sensitive user data
  - Changes to how you work with an existing vendor that result in the vendor handling sensitive user data
This is not an exhaustive list, but it should get you thinking about what sorts of changes should require a security review. As you define the criteria of what does and doesn't require a security review, talk it over with key stakeholders across the organization to ensure:
- Stakeholders have a chance to review and provide feedback on the criteria
- Stakeholders agree to the criteria
- Stakeholders agree to proactively request security reviews
Document these criteria, as well as how to request a security review (for example, filing a bug to a queue the security team monitors), to make it as easy as possible for your organization to follow this process.
Security review resourcing
Unlike automated scans, security reviews can be more resource intensive to perform. Every security team only has so much time in the day to accomplish a myriad of tasks, so you'll need to estimate how many security reviews may be generated based on your criteria. If you find your team is overwhelmed and falling behind, those waiting for their features to launch will become upset with the security team. This can cause a cultural shift in the organization causing the security team to be viewed as a blocker instead of a partner. If the security review process isn't efficient, many individuals and teams will try to bypass it completely. If resources are tight, consider loosening your criteria for requiring a security review, and be willing to accept some more residual risk. If incidents do occur as a result of lack of resources to perform security reviews, this will help justify the need for more security resources.
Performing security reviews
When it comes to deciding on which security reviews to perform and how to perform them, you'll need a prioritized queue to pull from. Create a standardized way for others in your organization to request a security review with whatever information you'll require to prioritize it appropriately. For example, consider a questionnaire that includes items such as the nature of the change, including a brief summary of the change and what types of user data may be impacted. You can automatically categorize potential security reviews into high, medium, or low risk changes based on the answers to these questions. If a change is high risk, you may require a more in depth security review process. If a change is lower risk, you may be able to implement a more lightweight security review process to help reduce resources required and speed up the process, better enabling the business. Consider setting up a rotation within your team to be responsible for managing the security review queue, ensuring that new security reviews are picked up by members of your team, and following up on progress of existing security reviews. The actual process of the security review will vary depending on what's being examined. For example, a new feature in your mobile app might require a security engineer to review the code and look for potential vulnerabilities. New software being installed might need to be reviewed to ensure access control is set up appropriately. Working with outside vendors can present an entirely different process. For reference, read through Google's Vendor Security Assessment Questionnaire.
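To make the questionnaire-based categorization described above more concrete, here is a minimal sketch. The question keys, the set of sensitive data types, and the thresholds for high, medium, and low are all assumptions chosen for illustration; your own rules should come from the review criteria you agreed on with stakeholders.

```python
# Hypothetical data types your organization treats as sensitive.
SENSITIVE_DATA_TYPES = {"home_address", "date_of_birth", "phone_number", "location", "financial"}


def categorize_review(answers):
    """Map questionnaire answers to a 'high', 'medium', or 'low' risk category."""
    touches_sensitive = bool(SENSITIVE_DATA_TYPES & set(answers.get("user_data_types", [])))
    changes_auth = answers.get("changes_auth", False)
    exposes_externally = answers.get("exposes_service_externally", False)
    new_vendor = answers.get("involves_new_vendor", False)

    # Illustrative rules only: authentication changes, or sensitive data combined
    # with external exposure or a new vendor, get the most in-depth review.
    if changes_auth or (touches_sensitive and (exposes_externally or new_vendor)):
        return "high"
    if touches_sensitive or exposes_externally or new_vendor:
        return "medium"
    return "low"


request = {
    "summary": "Let users share their location on a map",
    "user_data_types": ["location"],
    "exposes_service_externally": True,
}
print(categorize_review(request))
```

Requests can then be placed in the review queue by category, with high-risk changes picked up first.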
Fixing bugs
Finding bugs is important, but security only improves after those bugs are fixed. Knowing what risks exist to your organization is good, but being able to efficiently address that risk is better.
Vulnerabilities come from a variety of sources, including internal efforts (for example, vulnerability scans and security reviews), third party penetration tests and audits, or even external security researchers that notify you through support channels before your VDP is officially launched. Your organization needs a way to categorize new and existing vulnerabilities to ensure they are communicated to the right stakeholders, prioritized correctly, and fixed in a timely manner. When you launch your VDP, you'll have a new stream of vulnerabilities entering your vulnerability management processes. Having solid processes in place for handling these vulnerabilities helps you track progress towards remediation and respond to requests from external security researchers for updates. Being able to quickly prioritize a vulnerability and communicate with VDP participants about remediation timelines will increase engagement with the security researcher community, as well as improve the reputation of your organization's security. The following sections outline various aspects of your vulnerability management program you'll want to have in place before launching your VDP.
Establish severity standards and remediation timelines
Creating a common language around the severity of vulnerabilities and ideal remediation timelines associated with each severity makes it easier to set standard expectations with your organization. If every vulnerability is treated like an emergency, your organization will exhaust its resources and grow resentful towards the security team. If every vulnerability is considered low priority, vulnerabilities will never get fixed, and the risk of a breach increases. Every organization has limited resources, so you'll need to establish a severity ranking. This ranking provides criteria that help your organization understand what severity a vulnerability falls into, and the expected remediation timelines associated with each severity. Draft a set of severity guidelines and share it with key stakeholders in your organization for feedback. For example, if engineering is involved in crafting your severity standards, they're more likely to buy in to these standards and adhere to them when it comes time to fix a vulnerability within a specified timeframe. These severity guidelines may vary depending on what risks are specific to your business. You may want to consider a threat modeling exercise to think about what threats are most likely and impactful to your organization, and include examples of issues that would fall into each severity category. Below is an example of severity standards and remediation timelines for a financial organization.
|Severity|Description|Timeline|Example(s)|
|Critical|Issues that pose an imminent threat to our users or our business.|Owner: A primary owner for ensuring the vulnerability is fixed should be identified within 8 hours. Call and page resources as needed, even outside of normal business hours. Fix: The issue itself should be fixed, or at least have risk mitigated, as soon as possible, or at most, within three business days.|Compromise of a production database including all users' financial [...] An attacker gaining access to trade secrets, such as our proprietary investment algorithms. An active incident including an attacker gaining access to our internal network or sensitive production systems.|
|High|Issues that, if exploited, could cause significant damage.|Owner: A primary owner should be identified within one [...] Fix: Within 10 business days (~2 weeks).|Vulnerabilities that could result in access to sensitive user data or functionality (e.g. ability for any user to steal funds from another user).|
|Medium|Issues that are harder to exploit or do not result in direct damage.|Owner: A primary owner should be identified within five [...] Fix: Within 20-40 business days (~1-2 months).|Verified issues identified by automated scanners, such as patches for security updates without known exploits. Information disclosure issues that would likely help with further attacks. Rate limiting issues that could potentially be exploited (e.g. being able to continuously guess passwords for a user).|
|Low|Issues with minimal impact; primarily used for logging known issues.|No requirements for finding an owner or fixing within a specified timeline.|Information disclosure that does not present likely risk, but where the information does not need to be externally accessible.|
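Remediation timelines like the ones in this example are easier to enforce if the due date is computed when the bug is filed. The sketch below uses the fix windows from the example table (three, ten, and up to forty business days; none for low severity). It treats Monday to Friday as business days and ignores public holidays, which is a simplifying assumption, and the function names are illustrative.

```python
from datetime import date, timedelta

# Fix windows in business days, taken from the example table above.
# Medium uses the upper bound of the 20-40 day range; low has no deadline.
FIX_WITHIN_BUSINESS_DAYS = {"critical": 3, "high": 10, "medium": 40, "low": None}


def add_business_days(start, days):
    """Return the date that is `days` business days (Mon-Fri) after `start`."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def fix_due_date(severity, reported_on):
    """Return the date a fix is due, or None if the severity has no deadline."""
    days = FIX_WITHIN_BUSINESS_DAYS[severity.lower()]
    if days is None:
        return None
    return add_business_days(reported_on, days)


# 2024-01-01 is a Monday.
print(fix_due_date("critical", date(2024, 1, 1)))
print(fix_due_date("high", date(2024, 1, 1)))
```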
Grooming bugs
We're not talking about haircuts here, we're talking about ensuring bugs are formatted correctly so they can be easily fixed. Using the previous table as a guideline, establish your own severity definitions. These definitions are used to classify bugs into various severities and communicate them to owners.
On top of assigning each vulnerability a severity, you'll need to ensure your bugs are in a standard format that makes it easier for receiving teams to process. Vulnerabilities will enter your vulnerability management processes in a variety of formats (such as automated scanner results or manual write-ups from security reviews). Taking the time to convert each vulnerability into a standard format will increase the chances of the receiving team being able to quickly understand and address the issue.
This format or template might vary depending on your organization and what information is most pertinent to help owners fix bugs assigned to them, but here's an example template you can use. You'll be able to reuse this template later on when you create your vulnerability disclosure program submission form for researchers.
Title: <one line description of the issue, usually the vulnerability type and what asset/service/etc. is affected; optionally include the severity, or map the severity to a field in your issue tracker>
Summary: <brief description of the vulnerability and why it matters>
Reproduction Steps: <step by step instructions on how to show the existence of the vulnerability>
Impact / Attack Scenario: <how would this be exploited, and what would be the impact to your organization?>
Remediation Steps: <how can this vulnerability be directly fixed, or any other advice to help at least mitigate risk associated with this issue>
Here is an example of a potential high severity vulnerability:
Title: [HIGH] Insecure Direct Object Reference (IDOR) in profile pages
Summary: An IDOR was discovered in our app's profile pages functionality that would allow any user to gain unauthorized access to view and edit another user's profile, including the other user's full name, home address, phone number, and date of birth. We've reviewed the logs and this issue does not seem to have been exploited yet. This issue was discovered internally.
Reproduction Steps:
- Set up a proxy (for example, Burp Suite) to intercept traffic on a mobile device with the app installed.
- Visit your profile page and intercept the associated HTTP request.
- Modify profileID=###### to be profileID=000000 (this is a test user) and send along the HTTP request.
- The app will show the profile of user 000000, and you will be able to view and edit their information.
Impact / Attack Scenario: Any user can use this vulnerability to view and edit another user's profile. In the worst case scenario, an attacker could automate the process of retrieving every user's profile info in our entire system. While we don't believe this has been exploited yet, it's important that we treat this as a standard HIGH severity issue. If we observe evidence of exploitation, this could escalate to CRITICAL severity.
Remediation Steps: Implement server-side checks to ensure the user making the request should have access to view/edit the profile requested via the value of profileID. For example, if Alice is logged in and has profileID 123456, but Alice is observed to have requested profileID 333444, the user should see an error and this attempt to access another user's profile should be logged. For more information on IDOR and how to fix it, please see OWASP's materials on this bug.
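The server-side check described in the remediation steps of this example bug might look roughly like the following. This is a sketch, not the app's actual code: `authorize_profile_access`, `AccessDenied`, and the logger name are hypothetical, and it assumes the logged-in user's profile ID comes from the server-side session rather than from the request.

```python
import logging

logger = logging.getLogger("profile_access")


class AccessDenied(Exception):
    """Raised when a user requests a profile they are not allowed to access."""


def authorize_profile_access(session_profile_id: str, requested_profile_id: str) -> None:
    """Allow access only when the requested profile belongs to the logged-in user."""
    if session_profile_id != requested_profile_id:
        # Log the attempt so detection tooling can pick up repeated failures.
        logger.warning(
            "Denied: profile %s requested profile %s",
            session_profile_id,
            requested_profile_id,
        )
        raise AccessDenied("You do not have access to this profile.")


# Alice (profileID 123456) requesting her own profile is allowed.
authorize_profile_access("123456", "123456")

# Alice requesting profileID 333444 is rejected and logged.
try:
    authorize_profile_access("123456", "333444")
except AccessDenied as error:
    print(error)
```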
You can save time and manual effort by finding ways to automate converting vulnerabilities from various sources into your standard format. As you file more vulnerabilities, you might find common themes in remediation steps. Beyond your generic bug format template, you might want to create additional templates for common vulnerability types.
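As a minimal sketch of that kind of automation, the function below renders a finding into the example template's fields. The input dictionary keys (`type`, `asset`, `severity`, and so on) are assumptions for illustration; a real converter would map whatever fields your scanner or review write-up actually produces.

```python
def to_standard_bug(finding):
    """Render a finding (a dict with hypothetical keys) in the standard bug format."""
    severity = finding.get("severity", "unrated").upper()
    steps = "\n".join(
        f"{number}. {step}"
        for number, step in enumerate(finding.get("steps", []), start=1)
    )
    return (
        f"Title: [{severity}] {finding['type']} in {finding['asset']}\n"
        f"Summary: {finding.get('description', '')}\n"
        f"Reproduction Steps:\n{steps}\n"
        f"Impact / Attack Scenario: {finding.get('impact', '')}\n"
        f"Remediation Steps: {finding.get('remediation', '')}"
    )


# Hypothetical finding, loosely based on the example bug above.
finding = {
    "type": "IDOR",
    "asset": "profile pages",
    "severity": "high",
    "description": "Any user can view and edit another user's profile.",
    "steps": ["Intercept the profile request.", "Change profileID to another user's ID."],
    "impact": "An attacker could retrieve every user's profile info.",
    "remediation": "Add server-side authorization checks on profileID.",
}
print(to_standard_bug(finding))
```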
Perhaps one of the most difficult aspects of vulnerability management is identifying owners to help fix bugs, as well as getting their buy-in to dedicate resources towards actually fixing bugs on schedule. If you've set up asset management processes, this will be a bit easier. If not, this might serve as motivation to do so. Depending on the size of your organization, finding an owner might be fairly simple, or incredibly complex. As your organization grows, the effort to determine who is responsible for fixing newly discovered security issues also grows. Consider implementing an operational on-duty rotation. Whoever is on-duty is responsible for reviewing unassigned vulnerabilities, tracking down owners, and prioritizing based on severity. Even if you're able to identify who is responsible for fixing a vulnerability and assign them to the bug, you'll also need to persuade them to invest time in actually fixing it. This approach might vary based on the team or individual, and what other items they are working on. If you've achieved organizational buy-in on your severity standards and remediation timelines, you can refer to those, but sometimes it may take extra persuasion to get someone to fix a bug. Here are some general tips for driving remediation of vulnerabilities:
- Explain why: When someone is assigned a vulnerability to fix, it's usually unexpected work. Explain why this issue is important to fix in a timely manner (e.g. the Impact / Attack Scenario) and ensure the owner understands.
- Gather context: In some cases, only one person has the knowledge necessary to fix a bug, and they might have other tasks they're working on. Take the time to find out what these are - it's possible that the other tasks may be more important than fixing this vulnerability in the near term. Demonstrating empathy and flexibility on remediation timelines will help earn goodwill and strengthen your relationship with those you need to fix vulnerabilities. Just be careful not to give too much leeway, otherwise your organization won't take your remediation timelines seriously.
- Explain how: Even if you include remediation advice in the bug, the owner responsible for fixing the issue might be confused or need help learning how to fix it. If they need help, take the time to teach them. Simply throwing bugs at owners without helping them will hurt the organization's relationship with the security team. Helping others as much as possible will empower them to fix present and future vulnerabilities, as well as to help teach others.
- Adapt your request: Various teams and individuals may have existing processes for how they accept and prioritize incoming work requests. Some teams may want all incoming requests to come through their managers. Others will want requests for help to be submitted in a standard format. Some will only work on what's been predefined in a sprint. Whatever the case, taking some extra time to adapt your request to fit the format the team or individual usually uses to intake requests for help will increase the likelihood of your request being prioritized and actioned.
- Escalate as a last resort: If you've tried all of the above, and the individual or team responsible for fixing a vulnerability just won't take the time to fix a serious security issue, consider escalating to leadership as needed. This should always be a last resort, as it can damage your relationship with the individual or team in question.
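Returning to the on-duty rotation mentioned before these tips: whoever is on duty needs a view of unassigned vulnerabilities, ordered by severity, with a suggested owner where the asset inventory has one. The sketch below shows one way to build that view. The dictionary keys and the inventory mapping are assumptions for illustration.

```python
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def triage_queue(vulnerabilities, asset_owners):
    """List unassigned vulnerabilities, most severe first, with a suggested owner."""
    unassigned = [v for v in vulnerabilities if v.get("owner") is None]
    unassigned.sort(key=lambda v: SEVERITY_ORDER[v["severity"]])
    return [
        (v["title"], v["severity"], asset_owners.get(v["asset"], "unknown"))
        for v in unassigned
    ]


# Hypothetical open vulnerabilities and asset inventory.
vulnerabilities = [
    {"title": "Verbose error page", "severity": "low", "asset": "www.example.com", "owner": None},
    {"title": "IDOR in profile pages", "severity": "high", "asset": "api.example.com", "owner": None},
    {"title": "Outdated library", "severity": "medium", "asset": "batch.example.com", "owner": "data-team"},
]
asset_owners = {"www.example.com": "web-team", "api.example.com": "platform-team"}

for entry in triage_queue(vulnerabilities, asset_owners):
    print(entry)
```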
Root cause analysis
In addition to finding and fixing individual vulnerabilities, performing root cause analysis (RCA) can help you identify and address systemic security issues. Everyone has limited resources, so it's tempting to skip this step. However, investing time into analyzing trends in your vulnerability data, as well as to look further into critical and high severity bugs, can save time and reduce risk in the long term. As an example, let's say you notice the same vulnerability type (for example, intent redirection) appear over and over again throughout your app. You decide to talk to the teams that are introducing this vulnerability into your app, and realize the large majority of them do not understand what intent redirection is, why it matters, or how to prevent it. You put together a talk and a guide to help educate developers in your organization about this vulnerability. This vulnerability probably will not completely disappear, but the rate at which it appears will likely decrease. When you launch your VDP, every vulnerability reported to you by a third party is something that slipped through your existing internal security processes. Performing RCA on bugs from your VDP will provide even further insight into how to systematically improve your security program.
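A simple starting point for the trend analysis described above is to count how often each vulnerability type appears in your bug data. The sketch below assumes each bug record carries a `type` field, which is an illustrative choice; the threshold for what counts as a recurring theme is likewise arbitrary.

```python
from collections import Counter


def recurring_vulnerability_types(bugs, minimum_count=2):
    """Return (type, count) pairs for types seen at least `minimum_count` times."""
    counts = Counter(bug["type"] for bug in bugs)
    return [(kind, count) for kind, count in counts.most_common() if count >= minimum_count]


# Hypothetical bug records exported from an issue tracker.
bugs = [
    {"id": 1, "type": "intent redirection"},
    {"id": 2, "type": "cross-site scripting"},
    {"id": 3, "type": "intent redirection"},
    {"id": 4, "type": "intent redirection"},
]
print(recurring_vulnerability_types(bugs))
```

Types that show up repeatedly are good candidates for developer training, shared libraries, or other systemic fixes.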
Detection and response
Detection and response refers to any tooling and processes you have in place to detect and respond to potential attacks against your organization. This could come in the form of either purchased or self-developed solutions that analyze data to identify suspicious behavior. As an example, in the Grooming Bugs section we talked about logging every time a user attempts to gain unauthorized access to another user's profile. You might have a signal or alert that's generated if you notice a user generating a large number of failed attempts to access other users' profiles in a short period of time. You might even automate the process of blocking that user from accessing any of your services for a certain period, or indefinitely until the situation can be reviewed and access restored manually. If you don't already have detection and response mechanisms in place, consider working with an expert consultant to help guide you on how to build a digital forensics and incident response (DFIR) program for your organization. If you do have detection and response mechanisms already in place, you'll want to consider the consequences of having five, ten, or even one hundred security researchers testing against your Internet-facing attack surfaces. This can have a big impact on any IDS/IPS (intrusion detection and prevention systems) you have in place.
Potential risks include:
- Alerts overload: A flood of alerts or signals that look like malicious attacks, but are actually normal, approved testing from security researchers participating in your VDP. So much noise can be generated that it becomes difficult to distinguish real attacks from legitimate security testing.
- Incident response false alarms: If you have processes in place that page individuals at 2:00AM on Saturday, they will not be happy about waking up and investigating a potential breach that was actually just a security researcher performing legitimate testing.
- Blocking security researchers: If you have aggressive IPS (intrusion prevention systems) in place, you might end up blocking the IP address of a security researcher that's attempting to run scans, manual tests, etc. to identify vulnerabilities and report them to you. Especially in the case of a VDP, if a security researcher gets blocked after five minutes of testing, they may lose interest and instead focus on another organization's program. This can result in an overall lack of engagement in your program from security researchers, which increases the risk of vulnerabilities remaining undiscovered (and thus unknown to your organization). While you may not want to tone down your IPS itself, there are other measures you can take to mitigate the risk of disengaging researchers.
How to address these risks depends largely on what approach you want to take towards working with external security researchers. If you want a more black box style of testing that simulates real attacks, then you might do nothing. In this case, researchers' traffic will generate alerts and your team may take actions to investigate and respond accordingly. This will help your team get practice in responding to what looks like real attacks, but might decrease engagement with security researchers, especially if they are blocked from testing. It may also result in missing a real attack while spending time investigating legitimate testing.
If you want more of a gray box approach, you can consider working with security researchers to self-identify their testing traffic in some way. This will enable you to whitelist or otherwise filter out traffic from their testing and the resulting alerts. Your team will be able to better distinguish real attacks from approved testing, and researchers will be empowered to find and report vulnerabilities to you without being hindered by your intrusion prevention systems.
Some organizations ask security researchers to submit a form to apply for a unique identifier that can be attached to headers in requests generated by the researcher. This enables the organization to whitelist traffic for the researcher, as well as to identify if the researcher starts to go beyond the agreed scope of testing. If this happens, you can reach out to the researcher and ask them to hold off until you can work together on a testing plan.
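The two ideas in this section, alerting on bursts of failed access attempts and letting researchers self-identify their traffic, can be combined in one place. The following is a sketch under stated assumptions: the header name `X-VDP-Researcher-ID`, the registered identifier, and the thresholds are all hypothetical, headers are modeled as a plain dictionary, and timestamps are seconds that arrive in non-decreasing order. Because a request header can be copied by anyone, the identifier is best treated as a label for triage rather than as proof of identity.

```python
from collections import defaultdict, deque

RESEARCHER_HEADER = "X-VDP-Researcher-ID"      # hypothetical header name
REGISTERED_RESEARCHERS = {"researcher-1234"}   # hypothetical issued identifiers


def registered_researcher(headers):
    """Return the researcher ID from the headers if it is one you issued, else None."""
    value = headers.get(RESEARCHER_HEADER)
    return value if value in REGISTERED_RESEARCHERS else None


class FailedAccessMonitor:
    """Flag users with too many failed access attempts inside a sliding time window."""

    def __init__(self, max_failures=5, window_seconds=60):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)

    def record_failure(self, user_id, timestamp, headers):
        """Record a failed attempt; return None, 'approved-testing', or 'investigate'."""
        events = self._failures[user_id]
        events.append(timestamp)
        # Drop attempts that are older than the window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        if len(events) < self.max_failures:
            return None
        if registered_researcher(headers):
            return "approved-testing"
        return "investigate"


monitor = FailedAccessMonitor(max_failures=3, window_seconds=60)
tagged = {RESEARCHER_HEADER: "researcher-1234"}
for second in (0, 1, 2):
    result = monitor.record_failure("user-42", second, tagged)
print(result)
```

Alerts labeled as approved testing can be routed to a lower-priority queue instead of paging someone, while still leaving a record you can use to check whether testing stays within the agreed scope.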