feat: add timeout handling for cache database operations (#9307)

Teppei Fukuda
2025-08-18 01:01:27 -07:00
committed by GitHub
parent 04ad0c4fc2
commit 235c24e71a
4 changed files with 64 additions and 11 deletions

@@ -125,14 +125,57 @@ $ trivy image --download-java-db-only
$ trivy image [YOUR_JAVA_IMAGE]
```
-### Running in parallel takes same time as series run
-When running trivy on multiple images simultaneously, it will take same time as running trivy in series.
-This is because of a limitation of boltdb.
-> Bolt obtains a file lock on the data file so multiple processes cannot open the same database at the same time. Opening an already open Bolt database will cause it to hang until the other process closes it.
+### Database and cache lock errors
-Reference : [boltdb: Opening a database][boltdb].
+!!! error
+    ```
+    cache may be in use by another process
+    ```
-[boltdb]: https://github.com/boltdb/bolt#opening-a-database
+!!! error
+    ```
+    vulnerability database may be in use by another process
+    ```
+By default, Trivy uses BoltDB for its vulnerability database and cache storage. BoltDB creates file locks to prevent data corruption, which means only one process can access the same database file at a time.
+As stated in the BoltDB documentation:
+> Please note that Bolt obtains a file lock on the data file so multiple processes cannot open the same database at the same time. Opening an already open Bolt database will cause it to hang until the other process closes it.
+Reference: [BoltDB README](https://github.com/boltdb/bolt#opening-a-database)
+These errors occur when:
+- Multiple Trivy processes try to use the same cache directory simultaneously
+- A previous Trivy process did not shut down cleanly
+- Trivy server is running and holding locks on the database and cache
+#### Important Note
+Running multiple Trivy processes on the same machine is **not recommended**. Using the same cache directory for multiple processes does not improve performance and can cause unexpected errors due to BoltDB's locking mechanism.
+#### Solutions
+**Solution 1: Terminate conflicting processes** (Recommended)
+Check for running Trivy processes and terminate them:
+```bash
+$ ps aux | grep trivy
+$ kill [process_id]
+```
+**Solution 2: Use different cache directories** (If multiple processes are absolutely necessary)
+If you must run multiple Trivy processes on the same machine, specify different cache directories for each process:
+```bash
+$ trivy image --cache-dir /tmp/trivy-cache-1 debian:11 &
+$ trivy image --cache-dir /tmp/trivy-cache-2 debian:12 &
+```
+Note that each cache directory will download its own copy of the vulnerability database and other scan assets, which will increase network traffic and storage usage.
### Multiple Trivy servers

2
go.mod
@@ -24,7 +24,7 @@ require (
 	github.com/aquasecurity/testdocker v0.0.0-20250616060700-ba6845ac6d17
 	github.com/aquasecurity/tml v0.6.1
 	github.com/aquasecurity/trivy-checks v1.11.3-0.20250604022615-9a7efa7c9169
-	github.com/aquasecurity/trivy-db v0.0.0-20250723062229-56ec1e482238
+	github.com/aquasecurity/trivy-db v0.0.0-20250731052236-c7c831e2254d
 	github.com/aquasecurity/trivy-java-db v0.0.0-20240109071736-184bd7481d48
 	github.com/aquasecurity/trivy-kubernetes v0.9.1
 	github.com/aws/aws-sdk-go-v2 v1.37.1

4
go.sum
@@ -829,8 +829,8 @@ github.com/aquasecurity/tml v0.6.1 h1:y2ZlGSfrhnn7t4ZJ/0rotuH+v5Jgv6BDDO5jB6A9gw
 github.com/aquasecurity/tml v0.6.1/go.mod h1:OnYMWY5lvI9ejU7yH9LCberWaaTBW7hBFsITiIMY2yY=
 github.com/aquasecurity/trivy-checks v1.11.3-0.20250604022615-9a7efa7c9169 h1:TckzIxUX7lZaU9f2lNxCN0noYYP8fzmSQf6a4JdV83w=
 github.com/aquasecurity/trivy-checks v1.11.3-0.20250604022615-9a7efa7c9169/go.mod h1:nT69xgRcBD4NlHwTBpWMYirpK5/Zpl8M+XDOgmjMn2k=
-github.com/aquasecurity/trivy-db v0.0.0-20250723062229-56ec1e482238 h1:ZT7cZan/iS/nD7D6CG4/AVdtqArKi9GtovlL4lEi/RY=
-github.com/aquasecurity/trivy-db v0.0.0-20250723062229-56ec1e482238/go.mod h1:upAJqDQkN5FdIJbtJMpokncGNhYAPGkpoCbaGciWPt4=
+github.com/aquasecurity/trivy-db v0.0.0-20250731052236-c7c831e2254d h1:Lc+p2CLARivVF48o7uRoFPaahNCvNFyBfeby0JqAMXo=
+github.com/aquasecurity/trivy-db v0.0.0-20250731052236-c7c831e2254d/go.mod h1:upAJqDQkN5FdIJbtJMpokncGNhYAPGkpoCbaGciWPt4=
 github.com/aquasecurity/trivy-java-db v0.0.0-20240109071736-184bd7481d48 h1:JVgBIuIYbwG+ekC5lUHUpGJboPYiCcxiz06RCtz8neI=
 github.com/aquasecurity/trivy-java-db v0.0.0-20240109071736-184bd7481d48/go.mod h1:Ldya37FLi0e/5Cjq2T5Bty7cFkzUDwTcPeQua+2M8i8=
 github.com/aquasecurity/trivy-kubernetes v0.9.1 h1:bSErQcavKXDh7XMwbGX7Vy//jR5+xhe/bOgfn9G+9lQ=

14
pkg/cache/fs.go vendored
@@ -2,8 +2,10 @@ package cache
 import (
 	"encoding/json"
+	"errors"
 	"os"
 	"path/filepath"
+	"time"
 	"github.com/hashicorp/go-multierror"
 	bolt "go.etcd.io/bbolt"
@@ -12,6 +14,8 @@ import (
 	"github.com/aquasecurity/trivy/pkg/fanal/types"
 )
+const defaultFSCacheTimeout = 5 * time.Second
 var _ Cache = &FSCache{}
 type FSCache struct {
@@ -25,9 +29,15 @@ func NewFSCache(cacheDir string) (FSCache, error) {
 		return FSCache{}, xerrors.Errorf("failed to create cache dir: %w", err)
 	}
-	db, err := bolt.Open(filepath.Join(dir, "fanal.db"), 0o600, nil)
+	db, err := bolt.Open(filepath.Join(dir, "fanal.db"), 0o600, &bolt.Options{
+		Timeout: defaultFSCacheTimeout,
+	})
 	if err != nil {
-		return FSCache{}, xerrors.Errorf("unable to open DB: %w", err)
+		// Check if the error is due to timeout (database locked by another process)
+		if errors.Is(err, bolt.ErrTimeout) {
+			return FSCache{}, xerrors.Errorf("cache may be in use by another process: %w", err)
+		}
+		return FSCache{}, xerrors.Errorf("unable to open cache DB: %w", err)
 	}
 	err = db.Update(func(tx *bolt.Tx) error {