# β οΈ Breaking
Many deprecated queue config options are removed (actually, they should
have been removed in 1.18/1.19).
If you see the fatal message when starting Gitea: "Please update your
app.ini to remove deprecated config options", please follow the error
messages to remove these options from your app.ini.
Example:
```
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].ISSUE_INDEXER_QUEUE_TYPE`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].UPDATE_BUFFER_LEN`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [F] Please update your app.ini to remove deprecated config options
```
Many options in `[queue]` are are dropped, including:
`WRAP_IF_NECESSARY`, `MAX_ATTEMPTS`, `TIMEOUT`, `WORKERS`,
`BLOCK_TIMEOUT`, `BOOST_TIMEOUT`, `BOOST_WORKERS`, they can be removed
from app.ini.
# The problem
The old queue package has some legacy problems:
* complexity: I doubt few people could tell how it works.
* maintainability: Too many channels and mutex/cond are mixed together,
too many different structs/interfaces depends each other.
* stability: due to the complexity & maintainability, sometimes there
are strange bugs and difficult to debug, and some code doesn't have test
(indeed some code is difficult to test because a lot of things are mixed
together).
* general applicability: although it is called "queue", its behavior is
not a well-known queue.
* scalability: it doesn't seem easy to make it work with a cluster
without breaking its behaviors.
It came from some very old code to "avoid breaking", however, its
technical debt is too heavy now. It's a good time to introduce a better
"queue" package.
# The new queue package
It keeps using old config and concept as much as possible.
* It only contains two major kinds of concepts:
* The "base queue": channel, levelqueue, redis
* They have the same abstraction, the same interface, and they are
tested by the same testing code.
* The "WokerPoolQueue", it uses the "base queue" to provide "worker
pool" function, calls the "handler" to process the data in the base
queue.
* The new code doesn't do "PushBack"
* Think about a queue with many workers, the "PushBack" can't guarantee
the order for re-queued unhandled items, so in new code it just does
"normal push"
* The new code doesn't do "pause/resume"
* The "pause/resume" was designed to handle some handler's failure: eg:
document indexer (elasticsearch) is down
* If a queue is paused for long time, either the producers blocks or the
new items are dropped.
* The new code doesn't do such "pause/resume" trick, it's not a common
queue's behavior and it doesn't help much.
* If there are unhandled items, the "push" function just blocks for a
few seconds and then re-queue them and retry.
* The new code doesn't do "worker booster"
* Gitea's queue's handlers are light functions, the cost is only the
go-routine, so it doesn't make sense to "boost" them.
* The new code only use "max worker number" to limit the concurrent
workers.
* The new "Push" never blocks forever
* Instead of creating more and more blocking goroutines, return an error
is more friendly to the server and to the end user.
There are more details in code comments: eg: the "Flush" problem, the
strange "code.index" hanging problem, the "immediate" queue problem.
Almost ready for review.
TODO:
* [x] add some necessary comments during review
* [x] add some more tests if necessary
* [x] update documents and config options
* [x] test max worker / active worker
* [x] re-run the CI tasks to see whether any test is flaky
* [x] improve the `handleOldLengthConfiguration` to provide more
friendly messages
* [x] fine tune default config values (eg: length?)
## Code coverage:
![image](https://user-images.githubusercontent.com/2114189/236620635-55576955-f95d-4810-b12f-879026a3afdf.png)
When the base repository contains multiple branches with the same
commits as the base branch, pull requests can show a long list of
commits already in the base branch as having been added.
What this is supposed to do is exclude commits already in the base
branch. But the mechansim to do so assumed a commit only exists in a
single branch. Now use `git rev-list A B --not branchName` instead of
filtering commits afterwards.
The logic to detect if there was a force push also was wrong for
multiple branches. If the old commit existed in any branch in the base
repository it would assume there was no force push. Instead check if the
old commit is an ancestor of the new commit.
Related to: #22294#23186#23054
Replace: #23218
Some discussion is in the comments of #23218.
Highlights:
- Add Expiration for cache context. If a cache context has been used for
more than 10s, the cache data will be ignored, and warning logs will be
printed.
- Add `discard` field to `cacheContext`, a `cacheContext` with `discard`
true will drop all cached data and won't store any new one.
- Introduce `WithNoCacheContext`, if one wants to run long-life tasks,
but the parent context is a cache context,
`WithNoCacheContext(perentCtx)` will discard the cache data, so it will
be safe to keep the context for a long time.
- It will be fine to treat an original context as a cache context, like
`GetContextData(context.Backgraud())`, no warning logs will be printed.
Some cases about nesting:
When:
- *A*, *B* or *C* means a cache context.
- ~*A*~, ~*B*~ or ~*C*~ means a discard cache context.
- `ctx` means `context.Backgrand()`
- *A(ctx)* means a cache context with `ctx` as the parent context.
- *B(A(ctx))* means a cache context with `A(ctx)` as the parent context.
- `With` means `WithCacheContext`
- `WithNo` means `WithNoCacheContext`
So:
- `With(ctx)` -> *A(ctx)*
- `With(With(ctx))` -> *A(ctx)*, not *B(A(ctx))*
- `With(With(With(ctx)))` -> *A(ctx)*, not *C(B(A(ctx)))*
- `WithNo(ctx)` -> *ctx*, not *~A~(ctx)*
- `WithNo(With(ctx))` -> *~A~(ctx)*
- `WithNo(WithNo(With(ctx)))` -> *~A~(ctx)*, not *~B~(~A~(ctx))*
- `With(WithNo(With(ctx)))` -> *B(~A~(ctx))*
- `WithNo(With(WithNo(With(ctx))))` -> *~B~(~A~(ctx))*
- `With(WithNo(With(WithNo(With(ctx)))))` -> *C(~B~(~A~(ctx)))*
To avoid duplicated load of the same data in an HTTP request, we can set
a context cache to do that. i.e. Some pages may load a user from a
database with the same id in different areas on the same page. But the
code is hidden in two different deep logic. How should we share the
user? As a result of this PR, now if both entry functions accept
`context.Context` as the first parameter and we just need to refactor
`GetUserByID` to reuse the user from the context cache. Then it will not
be loaded twice on an HTTP request.
But of course, sometimes we would like to reload an object from the
database, that's why `RemoveContextData` is also exposed.
The core context cache is here. It defines a new context
```go
type cacheContext struct {
ctx context.Context
data map[any]map[any]any
lock sync.RWMutex
}
var cacheContextKey = struct{}{}
func WithCacheContext(ctx context.Context) context.Context {
return context.WithValue(ctx, cacheContextKey, &cacheContext{
ctx: ctx,
data: make(map[any]map[any]any),
})
}
```
Then you can use the below 4 methods to read/write/del the data within
the same context.
```go
func GetContextData(ctx context.Context, tp, key any) any
func SetContextData(ctx context.Context, tp, key, value any)
func RemoveContextData(ctx context.Context, tp, key any)
func GetWithContextCache[T any](ctx context.Context, cacheGroupKey string, cacheTargetID any, f func() (T, error)) (T, error)
```
Then let's take a look at how `system.GetString` implement it.
```go
func GetSetting(ctx context.Context, key string) (string, error) {
return cache.GetWithContextCache(ctx, contextCacheKey, key, func() (string, error) {
return cache.GetString(genSettingCacheKey(key), func() (string, error) {
res, err := GetSettingNoCache(ctx, key)
if err != nil {
return "", err
}
return res.SettingValue, nil
})
})
}
```
First, it will check if context data include the setting object with the
key. If not, it will query from the global cache which may be memory or
a Redis cache. If not, it will get the object from the database. In the
end, if the object gets from the global cache or database, it will be
set into the context cache.
An object stored in the context cache will only be destroyed after the
context disappeared.
The update by rebase code reuses the merge code but shortcircuits and
pushes back up to the head. However, it doesn't set the correct pushing
environment - and just uses the same environment as the base repo. This
leads to the push update failing and thence the PR becomes out-of-sync
with the head.
This PR fixes this and adjusts the trace logging elsewhere to help make
this clearer.
Fix#18802
Signed-off-by: Andrew Thornton <art27@cantab.net>
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
After #22362, we can feel free to use transactions without
`db.DefaultContext`.
And there are still lots of models using `db.DefaultContext`, I think we
should refactor them carefully and one by one.
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Change all license headers to comply with REUSE specification.
Fix#16132
Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
This PR adds a context parameter to a bunch of methods. Some helper
`xxxCtx()` methods got replaced with the normal name now.
Co-authored-by: delvh <dev.lh@web.de>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Fix#19513
This PR introduce a new db method `InTransaction(context.Context)`,
and also builtin check on `db.TxContext` and `db.WithTx`.
There is also a new method `db.AutoTx` has been introduced but could be used by other PRs.
`WithTx` will always open a new transaction, if a transaction exist in context, return an error.
`AutoTx` will try to open a new transaction if no transaction exist in context.
That means it will always enter a transaction if there is no error.
Co-authored-by: delvh <dev.lh@web.de>
Co-authored-by: 6543 <6543@obermui.de>
For normal commits the notification url was wrong because oldCommitID is received from the shrinked commits list.
This PR moves the commits list shrinking after the oldCommitID assignment.
* Move access and repo permission to models/perm/access
* fix test
* Move some git related files into sub package models/git
* Fix build
* fix git test
* move lfs to sub package
* move more git related functions to models/git
* Move functions sequence
* Some improvements per @KN4CK3R and @delvh
* Move some repository related code into sub package
* Move more repository functions out of models
* Fix lint
* Some performance optimization for webhooks and others
* some refactors
* Fix lint
* Fix
* Update modules/repository/delete.go
Co-authored-by: delvh <dev.lh@web.de>
* Fix test
* Merge
* Fix test
* Fix test
* Fix test
* Fix test
Co-authored-by: delvh <dev.lh@web.de>
When a new branch is pushed the old SHA is always listed as the empty sha and thus the compare link that is created does not work correctly.
Therefore when creating the compare link for new branches:
1. Attempt to get the parent of the first commit and use that as the basis
for the compare link.
2. If this is not possible make a comparison to the default branch
3. Finally if that is not possible simply do not show a compare link.
However, there are multiple broken compare links remaining therefore, in order for these to not break we will simply make the compare link redirect to the default branch.
Fix#19144
Signed-off-by: a1012112796 <1012112796@qq.com>
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Andrew Thornton <art27@cantab.net>
* Remove `db.DefaultContext` usage in routers, use `ctx` directly
* Use `ctx` directly if there is one, remove some `db.DefaultContext` in `services`
* Use ctx instead of db.DefaultContext for `cmd` and some `modules` packages
* fix incorrect context usage
* Start adding mechanism to return unhandled data
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Create pushback interface
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Add Pausable interface to WorkerPool and Manager
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Implement Pausable and PushBack for the bytefifos
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Implement Pausable and Pushback for ChannelQueues and ChannelUniqueQueues
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Wire in UI for pausing
Signed-off-by: Andrew Thornton <art27@cantab.net>
* add testcases and fix a few issues
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix build
Signed-off-by: Andrew Thornton <art27@cantab.net>
* prevent "race" in the test
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix jsoniter mismerge
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix conflicts
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix format
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Add warnings for no worker configurations and prevent data-loss with redis/levelqueue
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Use StopTimer
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
This PR continues the work in #17125 by progressively ensuring that git
commands run within the request context.
This now means that the if there is a git repo already open in the context it will be used instead of reopening it.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Some refactors related repository model
* Move more methods out of repository
* Move repository into models/repo
* Fix test
* Fix test
* some improvements
* Remove unnecessary function
* DBContext is just a Context
This PR removes some of the specialness from the DBContext and makes it context
This allows us to simplify the GetEngine code to wrap around any context in future
and means that we can change our loadRepo(e Engine) functions to simply take contexts.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix unit tests
Signed-off-by: Andrew Thornton <art27@cantab.net>
* another place that needs to set the initial context
Signed-off-by: Andrew Thornton <art27@cantab.net>
* avoid race
Signed-off-by: Andrew Thornton <art27@cantab.net>
* change attachment error
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Make modules/context.Context a context.Context
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Simplify context calls
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Set the base context for requests to the HammerContext
Signed-off-by: Andrew Thornton <art27@cantab.net>
* pass context into get-last-commit
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Make commit_info cancellable
Signed-off-by: Andrew Thornton <art27@cantab.net>
* use context as context
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
PR #13381 refactored notification actions to call NotifyCreateRef and
NotifyDeleteRef in a slightly different manner. This leads to the pusher
not being set before the call.
This PR ensures that the pusher is updated before the call.
Fix#13940
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Cache last commit when pushing for big repository
* Fix bug
* detect force push
* Refactor cache push
* Finish cache last commit info when push
* Some improvements
* Fix lint
* Remove unused changes
* Move pull request test before cache
* Fix test mysql
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: Lauris BH <lauris@nix.lv>
We should only update is_empty, default_branch and updated time columns
during commitRepoAction and not update other columns as we risk
overwriting incorrect information.
Fix#11823Fix#10536
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* Fix test
* Add no queue for test only
* improve code
* Auto watch whatever branch operation
* Fix lint
* Rename noqueue to immediate
* Remove old PushUpdate function
* Fix tests
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>