haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-02-20 12:46:56 +00:00

Author	SHA1	Message	Date
Willy Tarreau	09e266e6f5	MINOR: proto: skip socket setup for duped FDs It's not strictly necessary, but it's still better to avoid setting up the same socket multiple times when it's being duplicated to a few FDs. We don't change that for inherited ones however since they may really need to be set up, so we only skip duplicated ones.	2023-04-21 17:41:26 +02:00
Willy Tarreau	0e1aaf4e78	MEDIUM: proto: duplicate receivers marked RX_F_MUST_DUP The different protocol's ->bind() function will now check the receiver's RX_F_MUST_DUP flag to decide whether to bind a fresh new listener from scratch or reuse an existing one and just duplicate it. It turns out that the existing code already supports reusing FDs since that was done as part of the FD passing and inheriting mechanism. Here it's not much different, we pass the FD of the reference receiver, it gets duplicated and becomes the new receiver's FD. These FDs are also marked RX_F_INHERITED so that they are not exported and avoid being touched directly (only the reference should be touched).	2023-04-21 17:41:26 +02:00
Willy Tarreau	e4c36aa8a1	MINOR: receiver: add RX_F_MUST_DUP to indicate that an rx must be duped The purpose of this new flag will be to mark that some listeners duplicate their reference's FD instead of trying to setup a completely new listener from scratch. This will be used when multiple groups want to listen to the same socket, via multiple FDs.	2023-04-21 17:41:26 +02:00
Willy Tarreau	aae1810b4d	MINOR: receiver: add a struct shard_info to store info about each shard In order to create multiple receivers for one multi-group shard, we'll need some more info about the shard. Here we store: - the number of groups (= number of receivers) - the number of threads (will be used for accept LB) - pointer to the reference rx (to get the FD and to find all threads) - pointers to the other members (to iterate over all threads) For now since there's only one group per shard it remains simple. The listener deletion code already takes care of removing the current member from its shards list and moving others' reference to the last one if it was their reference (so as to avoid o(n^2) updates during ordered deletes). Since the vast majority of setups will not use multi-group shards, we try to save memory usage by only allocating the shard_info when it is needed, so the principle here is that a receiver shard_info==NULL is alone and doesn't share its socket with another group. Various approaches were considered and tests show that the management of the listeners during boot makes it easier to just attach to or detach from a shard_info and automatically allocate it if it does not exist, which is what is being done here. For now the attach code is not called, but detach is already called on delete.	2023-04-21 17:41:26 +02:00
Willy Tarreau	84fe1f479b	MINOR: listener: support another thread dispatch mode: "fair" This new algorithm for rebalancing incoming connections to multiple threads is simpler and instead of considering the threads load, it will only cycle through all of them, offering a fair share of the traffic to each thread. It may be well suited for short-lived connections but is also convenient for very large thread counts where it's not always certain that the least loaded thread will always be found.	2023-04-21 17:41:26 +02:00
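The "fair" mode described in 84fe1f479b amounts to a plain rotation over the eligible threads. A minimal sketch of that idea follows; the function name and the cursor argument are hypothetical and are not taken from the HAProxy sources, which may track the position differently.

```c
#include <assert.h>

/* Hypothetical helper: hand out thread indexes in strict rotation.
 * 'cursor' remembers the next position to use, 'nbthread' is the number
 * of threads eligible for this listener. Thread load is deliberately
 * ignored, which is the point of the "fair" mode.
 */
static unsigned int fair_next_thread(unsigned int *cursor, unsigned int nbthread)
{
	assert(nbthread > 0);

	unsigned int t = *cursor % nbthread;

	*cursor = (t + 1) % nbthread;
	return t;
}
```

Each call returns the current position and advances the cursor by one, so with three threads successive connections go to 0, 1, 2, 0 and so on, whatever their load.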
Willy Tarreau	6a4d48b736	MINOR: quic_sock: index li->per_thr[] on local thread id, not global one There's a li_per_thread array in each listener for use with QUIC listeners. Since thread groups were introduced, this array can be allocated too large because global.nbthread is allocated for each listener, while only no more than MIN(nbthread,MAX_THREADS_PER_GROUP) may be used by a single listener. This was because the global thread ID is used as the index instead of the local ID (since a listener may only be used by a single group). Let's just switch to local ID and reduce the allocated size.	2023-04-21 17:41:26 +02:00
Willy Tarreau	77d37b07b1	MINOR: quic: support migrating the listener as well When migrating a quic_conn to another thread, we may need to also switch the listener if the thread belongs to another group. When this happens, the freshly created connection will already have the target listener, so let's just pick it from the connection and use it in qc_set_tid_affinity(). Note that it will be the caller's responsibility to guarantee this.	2023-04-21 17:41:26 +02:00
Aurelien DARRAGON	23f352f7d0	MINOR: server/event_hdl: prepare for server event data wrapper Adding the possibility to publish an event using a struct wrapper around existing SERVER events to provide additional contextual info. Using the specific struct wrapper is not required: it is supported to cast event data as a regular server event data struct so that we don't break the existing API. However, casting event data with a more explicit data type allows to fetch event-only relevant hints.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	f71e0645c1	MEDIUM: server: split srv_update_status() in two functions Considering that srv_update_status() is now synchronous again since `3ff577e1` ("MAJOR: server: make server state changes synchronous again"), and that we can easily identify if the update is from an operational or administrative context thanks to "MINOR: server: pass adm and op cause to srv_update_status()". And given that administrative and operational updates cannot be cumulated (since srv_update_status() is called synchronously and independently for admin updates and state/operational updates, and the function directly consumes the changes). We split srv_update_status() in 2 distinct parts: Either <type> is 0, meaning the update is an operational update which is handled by directly looking at cur_state and next_state to apply the proper transition. Also, the check to prevent operational state from being applied if MAINT admin flag is set is no longer needed given that the calling functions already ensure this (ie: srv_set_{running,stopping,stopped}). Or <type> is 1, meaning the update is an administrative update, where cur_admin and next_admin are evaluated to apply the proper transition and deduce the resulting server state (next_state is updated implicitly). Once this is done, both operations share a common code path in srv_update_status() to update proxy and servers stats if required. Thanks to this change, the function's behavior is much more predictable, it is not an all-in-one function anymore. Either we apply an operational change, else it is an administrative change. That's it, we cannot mix the 2 since both code paths are now properly separated.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	76e255520f	MINOR: server: pass adm and op cause to srv_update_status() Operational and administrative state change causes are not propagated through srv_update_status(), instead they are directly consumed within the function to provide additional info during the call when required. Thus, there is no valid reason for keeping adm and op causes within server struct. We are wasting space and keeping unneeded complexity. We now explicitly pass change type (operational or administrative) and associated cause to srv_update_status() so that no extra storage is needed since those values are only relevant from srv_update_status().	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	10518c0d59	CLEANUP: server: fix srv_set_{running, stopping, stopped} function comment Fixing function comments for the server state changing function since they still refer to asynchronous propagation of server state which is no longer in play. Moreover, there were some mixups between running/stopping.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	c54b98ac9a	CLEANUP: server: remove unused variables in srv_update_status() check and px local variable aliases are not very useful. Let's remove them and use s->check and s->proxy instead.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	1746b56e68	MINOR: server: change srv_op_st_chg_cause storage type This one is greatly inspired by "MINOR: server: change adm_st_chg_cause storage type". While looking at current srv_op_st_chg_cause usage, it was clear that the struct needed some cleanup since some leftovers from asynchronous server state change updates were left behind and resulted in some useless code duplication, and making the whole thing harder to maintain. Two observations were made: - by tracking down srv_set_{running, stopped, stopping} usage, we can see that the <reason> argument is always a fixed statically allocated string. - check-related state change context (duration, status, code...) is not used anymore since srv_append_status() directly extracts the values from the server->check. This is pure legacy from when the state changes were applied asynchronously. To prevent code duplication, useless string copies and make the reason/cause more exportable, we store it as an enum now, and we provide srv_op_st_chg_cause() function to fetch the related description string. HEALTH and AGENT causes (check related) are now explicitly identified to make consumers like srv_append_op_chg_cause() able to fetch checks info from the server itself if they need to.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	f3b48a808e	MINOR: server: srv_append_status refacto srv_append_status() has become a swiss-knife function over time. It is used from server code and also from checks code, with various inputs and distinct code paths, making it very hard to guess the actual behavior of the function (resulting string output). To simplify the logic behind it, we're dividing it in multiple contextual functions that take simple inputs and do explicit things, making them more predictable and easier to maintain.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	9b1ccd7325	MINOR: server: change adm_st_chg_cause storage type Even though it doesn't look like it at first glance, this is more like a cleanup than an actual code improvement: Given that srv->adm_st_chg_cause has been used to exclusively store static strings ever since it was implemented, we make the choice to store it as an enum instead of a fixed-size string within server struct. This will allow to save some space in server struct, and will make it more easily exportable (ie: event handlers) because of the reduced memory footprint during handling and the ability to later get the corresponding human-readable message when it's explicitly needed.	2023-04-21 14:36:45 +02:00
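The trade described in 9b1ccd7325 (store a small enum in the struct, look up the human-readable text only when it is needed) can be sketched as below. The enum values, the messages and the function name are hypothetical placeholders, not the identifiers used in the HAProxy tree.

```c
#include <assert.h>

/* Hypothetical causes; the real list lives in the HAProxy sources. */
enum adm_cause_sketch {
	CAUSE_NONE = 0,
	CAUSE_CLI,
	CAUSE_STATE_FILE,
	CAUSE_COUNT /* must remain last */
};

/* Returns the static description matching <cause>. Nothing is copied:
 * the struct only needs to hold the enum, and the string is fetched
 * on demand by whoever wants to print it.
 */
static const char *adm_cause_str(enum adm_cause_sketch cause)
{
	static const char *const msgs[CAUSE_COUNT] = {
		[CAUSE_NONE]       = "",
		[CAUSE_CLI]        = "changed from CLI",
		[CAUSE_STATE_FILE] = "changed from server-state after a reload",
	};

	assert((unsigned int)cause < CAUSE_COUNT);
	return msgs[cause];
}
```

An event handler would then only need to carry the enum value around and call the lookup at the point where a message is actually emitted.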
Aurelien DARRAGON	85b91375bf	MINOR: server: propagate lb changes through srv_lb_propagate() Now that we have a generic srv_lb_propagate(s) function, let's use it each time we explicitly want to set the status down as well. Indeed, it is tricky to try to handle "down" case explicitly, instead we use srv_lb_propagate() which will call the proper function that will handle the new server state. This will allow some code cleanup and will prevent any logic error. This commit depends on: - "MINOR: server: propagate server state change to lb through single function"	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	8bbe643acc	MINOR: server: propagate server state change to lb through single function Use a dedicated helper function to propagate server state change to lb algorithms, since it is performed at multiple places within srv_update_status() function.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	5f80f8bbc5	MINOR: server: central update for server counters on state change Based on "BUG/MINOR: server: don't miss server stats update on server state transitions", we're also taking advantage of the new centralized logic to update down_trans server counter directly from there instead of multiple places.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	9c21ff0208	BUG/MINOR: server: don't use date when restoring last_change from state file When restoring from a state file: the server "Status" reports weird values on the html stats page: "5s UP" becomes -> "? UP" after the restore This is due to a bug in srv_state_srv_update(): when restoring the states from a state file, we rely on date.tv_sec to compute the process-relative server last_change timestamp. This is wrong because everywhere else we use now.tv_sec when dealing with last_change, for instance in srv_update_status(). date (which is Wall clock time) deviates from now (monotonic time) in the long run. They should not be mixed, and given that last_change is an internal time value, we should rely on now.tv_sec instead. last_change export through "show servers state" cli is safe since we export a delta and not the raw time value in dump_servers_state(): srv_time_since_last_change = now.tv_sec - srv->last_change -- While this bug affects all stable versions, it was revealed in 2.8 thanks to `28360dc` ("MEDIUM: clock: force internal time to wrap early after boot") This is due to the fact that "now" immediately deviates from "date", whereas in the past they had the same value when starting. Thus prior to 2.8 the bug is trickier since it could take some time for date and now to deviate sufficiently for the issue to arise, and instead of reporting absurd values that are easy to spot it could just result in last_change becoming inconsistent over time. As such, the fix should be backported to all stable versions. [for 2.2 the patch needs to be applied manually since srv_state_srv_update() was named srv_update_state() and can be found in server.c instead of server_state.c]	2023-04-21 14:36:45 +02:00
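The point made in 9c21ff0208 is that an absolute timestamp rebuilt from a saved delta is only meaningful against the clock it is later compared with. The sketch below illustrates this with two hypothetical helpers; only the delta formula in the second one is quoted from the commit message, everything else is an assumption made for the example.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical restore step: the state file carries the number of seconds
 * elapsed since the last change, so the absolute value is rebuilt from the
 * reference clock passed by the caller.
 */
static time_t restore_last_change(time_t ref_sec, time_t since_last_change)
{
	assert(since_last_change >= 0);
	return ref_sec - since_last_change;
}

/* Same delta as the one quoted in the commit message:
 * srv_time_since_last_change = now.tv_sec - srv->last_change
 */
static time_t time_since_last_change(time_t now_sec, time_t last_change)
{
	return now_sec - last_change;
}
```

Restoring against the internal clock and reading back against the same clock returns the saved delta unchanged. Restoring against a wall-clock value that has drifted from the internal clock returns a delta that is off by the drift, which would be consistent with the absurd "Status" durations the commit describes.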
Aurelien DARRAGON	9f5853fa38	BUG/MINOR: server: don't miss server stats update on server state transitions s->last_change and s->down_time updates were manually updated for each effective server state change within srv_update_status(). This is rather error-prone, and as a result there were still some state transitions that were not handled properly since at least 1.8. ie: - when transitioning from DRAIN to READY: downtime was updated (which is wrong since a server in DRAIN state should not be considered as DOWN) - when transitioning from MAINT to READY: downtime was not updated (this can be easily seen in the html stats page) To fix these all at once, and prevent similar bugs from being introduced, we centralize the server last_change and down_time stats logic at the end of srv_update_status(): If the server state changed during the call, then it means that last_change must be updated, with a special case when changing from STOPPED state which means the server was previously DOWN and thus downtime should be updated. This patch depends on: - "MINOR: server: explicitly commit state change in srv_update_status()" This could be backported to all stable versions.	2023-04-21 14:36:45 +02:00
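The centralized bookkeeping described in 9f5853fa38 can be pictured as a single block executed once the new state is known. The sketch below is a simplified model under stated assumptions: the struct, the enum and the function are hypothetical, and the admin flags (MAINT, DRAIN) that the real code also deals with are left out.

```c
#include <assert.h>
#include <stddef.h>

enum srv_state_sketch { ST_STOPPED, ST_STARTING, ST_RUNNING, ST_STOPPING };

struct srv_sketch {
	enum srv_state_sketch cur_state;
	long last_change; /* seconds, internal clock */
	long down_time;   /* cumulated seconds spent DOWN */
};

/* Apply <next> and update the stats in one place: last_change moves on any
 * effective state change, down_time only grows when leaving STOPPED.
 */
static void apply_state(struct srv_sketch *s, enum srv_state_sketch next, long now_sec)
{
	assert(s != NULL);

	enum srv_state_sketch prev = s->cur_state;

	s->cur_state = next;
	if (prev != next) {
		if (prev == ST_STOPPED)
			s->down_time += now_sec - s->last_change;
		s->last_change = now_sec;
	}
}
```

Because the comparison happens once at the end, adding a new transition elsewhere cannot forget the counters, which is the property the commit is after.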
Aurelien DARRAGON	e80ddb18a8	BUG/MINOR: server: don't miss proxy stats update on server state transitions backend "down" stats logic has been duplicated multiple times in srv_update_status(), resulting in the logic now being error-prone. For example, the following bugfix was needed to compensate for a copy-paste introduced bug: `d332f139` ("BUG/MINOR: server: update last_change on maint->ready transitions too") While the above patch works great, we actually forgot to update the proxy downtime like it is done for other down->up transitions... This is simply illustrating that the current design is error-prone, it is very easy to miss something in this area. To properly update the proxy downtime stats on the maint->ready transition, to cleanup srv_update_status() and to prevent similar bugs from being introduced in the future, proxy/backend stats update are now automatically performed at the end of the server state change if needed. Thus we can remove existing updates that were performed at various places within the function, this simplifies things a bit. This patch depends on: - "MINOR: server: explicitly commit state change in srv_update_status()" This could be backported to all stable versions. Backport notes: 2.2: Replace struct task *srv_cleanup_toremove_conns(struct task *task, void *context, unsigned int state) by struct task *srv_cleanup_toremove_connections(struct task *task, void *context, unsigned short state)	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	22151c70bb	MINOR: server: explicitly commit state change in srv_update_status() As shown in `8f29829` ("BUG/MEDIUM: checks: a down server going to maint remains definitely stucked on down state."), state changes that don't result in explicit lb state change, require us to perform an explicit server state commit to make sure the next state is applied before returning from the function. This is the case for server state changes that don't trigger lb logic and only perform some logging. This is quite error prone, we could easily forget a state change combination that could result in next_state, next_admin or next_eweight not being applied. (cur_state, cur_admin and cur_eweight would be left with unexpected values) To fix this, we explicitly call srv_lb_commit_status() at the end of srv_update_status() to enforce the new values, even if they were already applied. (when a state changes requires lb state update an implicit commit is already performed) Applying the state change multiple times is safe (since the next value always points to the current value). Backport notes: 2.2: Replace struct task *srv_cleanup_toremove_conns(struct task *task, void *context, unsigned int state) by struct task *srv_cleanup_toremove_connections(struct task *task, void *context, unsigned short state)	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	9a1df02ccb	BUG/MINOR: server: incorrect report for tracking servers leaving drain Report message for tracking servers completely leaving drain is wrong: The check for "leaving drain .. via" never evaluates because the condition !(s->next_admin & SRV_ADMF_FDRAIN) is always true in the current block which is guarded by !(s->next_admin & SRV_ADMF_DRAIN). For tracking servers that leave inherited drain mode, this results in the following message being emitted: "Server x/b is UP (leaving forced drain)" Instead of: "Server x/b is UP (leaving drain) via x/a" To fix this: we check if FDRAIN is currently set, else it means that the drain status is inherited from the tracked server (IDRAIN) This regression was introduced with `64cc49cf` ("MAJOR: servers: propagate server status changes asynchronously."), thus it may be backported to all stable versions.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	2dac67af7d	DOC: lua: restore 80 char limitation Restore 80 char limitation throughout the file for easier reading on the cli, and fix some raw formatting issues without altering html rendering.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	096b383e16	MINOR: hlua/event_hdl: timestamp for events 'when' optional argument is provided to lua event handlers. It is an integer representing the number of seconds elapsed since Epoch and may be used in conjunction with lua `os.date()` function to provide a custom format string.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	e9314fb7a7	MINOR: event_hdl: provide event->when for advanced handlers For advanced async handlers only (Registered using EVENT_HDL_ASYNC_TASK() macro): event->when is provided as a struct timeval and fetched from 'date' haproxy global variable. Thanks to 'when', related event consumers will be able to timestamp events, even if they don't work in real-time or near real-time. Indeed, unlike sync or normal async handlers, advanced async handlers could purposely delay the consumption of pending events, which means that the date wouldn't be accurate if computed directly from within the handler.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	ebf58e991a	MINOR: event_hdl: dynamically allocated event data members Add the ability to provide a cleanup function for event data passed via the publishing function. One use case could be the need to provide valid pointers in the safe section of the data struct. Cleanup function will be automatically called with data (or copy of data) as argument when all handlers consumed the event, which provides an easy way to release some memory or decrement refcounts to resources that were provided through the data struct. data in itself may not be freed by the cleanup function, it is handled by the API. This would allow passing large (allocated) data blocks through the data struct while keeping data struct size under the EVENT_HDL_ASYNC_EVENT_DATA size limit. To do so, when publishing an event, where we would currently do: struct event_hdl_cb_data_new_family event_data; /* safe data, available from both sync and async contexts; may not use pointers to short-living resources */ event_data.safe.my_custom_data = x; /* unsafe data, only available from sync contexts */ event_data.unsafe.my_unsafe_data = y; /* once data is prepared, we can publish the event */ event_hdl_publish(NULL, EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1, EVENT_HDL_CB_DATA(&event_data)); We could do: struct event_hdl_cb_data_new_family event_data; /* safe data, available from both sync and async contexts; may not use pointers to short-living resources, unless EVENT_HDL_CB_DATA_DM is used to ensure pointer consistency (ie: refcount) */ event_data.safe.my_custom_static_data = x; event_data.safe.my_custom_dynamic_data = malloc(1); /* unsafe data, only available from sync contexts */ event_data.unsafe.my_unsafe_data = y; /* once data is prepared, we can publish the event */ event_hdl_publish(NULL, EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1, EVENT_HDL_CB_DATA_DM(&event_data, data_new_family_cleanup)); With data_new_family_cleanup func which would look like this: void data_new_family_cleanup(const void *data) { const struct event_hdl_cb_data_new_family *event_data = data; /* some data members require specific cleanup once the event is consumed */ free(event_data->safe.my_custom_dynamic_data); /* don't ever free data! it is not ours */ } Not sure if this feature will become relevant in the future, so I prefer not to mention it in the doc for now. But given that the implementation is trivial and does not put a burden on the existing API, it's a good thing to have it there, just in case.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	147691fd83	CLEANUP: event_hdl: fix comment typo about _sync assertion Fixing a comment relative to EVENT_HDL_ASSERT_SYNC macro where a typo was made and the comment was lacking some context.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	363ef4daa7	CLEANUP: event_hdl: updating obsolete comment for EVENT_HDL_CB_DATA EVENT_HDL_CB_DATA macro comments were not updated during the API refactor, fixing that.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	8273bfc639	BUG/MINOR: event_hdl: don't waste 1 event subtype slot ESUB_INDEX(n) index macro is used exclusively with n > 0 Fixing it so that it starts numbering at 1 instead of 2. This way, we don't waste a subtype slot in event_hdl_sub_type struct, and we comply with the structure comments about max supported event subtypes (currently set at 16). If `68e692da0` ("MINOR: event_hdl: add event handler base api") is being backported, then this commit should be backported with it.	2023-04-21 14:36:45 +02:00
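The off-by-one described in 8273bfc639 is easy to see with a pair of shift macros. The sketch below is illustrative only: the macro names are made up, the bodies are a guess at the shape of the fix rather than a copy of ESUB_INDEX(), and the 16-bit storage is an assumption chosen to match the stated maximum of 16 subtypes.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative macros, not copied from the HAProxy tree. */
#define SUB_INDEX_OLD(n) ((uint32_t)1 << (n))       /* n = 1 maps to 2 */
#define SUB_INDEX_NEW(n) ((uint32_t)1 << ((n) - 1)) /* n = 1 maps to 1 */
#define SUB_MAX 16

/* Returns the bit assigned to subtype <n>, with n in 1..SUB_MAX. */
static uint32_t sub_index(unsigned int n)
{
	assert(n >= 1 && n <= SUB_MAX);
	return SUB_INDEX_NEW(n);
}

/* Returns 1 when <index> is a usable bit of a 16-bit subtype field. */
static int fits_in_16_bits(uint32_t index)
{
	return index != 0 && index <= UINT16_MAX;
}
```

With the old numbering the lowest bit is never used and the 16th subtype no longer fits in 16 bits; starting at 1 makes all 16 slots available.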
Aurelien DARRAGON	a63f4903c9	MINOR: server/event_hdl: prepare for upcoming refactors This commit does nothing that ought to be mentioned, except that it adds missing comments and slightly moves some function calls out of "sensitive" code in preparation of some server code refactors.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	2f6a07dce8	MINOR: hlua/event_hdl: fix return type for hlua_event_hdl_cb_data_push_args Changing hlua_event_hdl_cb_data_push_args() return type to void since it does not return anything useful. Also changing its name to hlua_event_hdl_cb_push_args() since it does more than just pushing cb data argument (it also handles event type and mgmt). Errors caught by the function are reported as lua errors.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	55f84c7cab	MINOR: hlua/event_hdl: expose proxy_uuid variable in server events Adding proxy_uuid to ServerEvent class. proxy_uuid contains the uuid of the proxy to which the server belongs	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	3d9bf4e1a5	MINOR: hlua/event_hdl: rely on proxy_uuid instead of proxy_name for lookups Since "MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server" we may now use proxy_uuid variable to perform proxy lookups when handling a server event. It is more reliable since proxy_uuid isn't subject to any size limitation	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	d714213862	MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server Expose proxy_uuid variable in event_hdl_cb_data_server struct to overcome proxy_name fixed length limitation. proxy_uuid may be used by the handler to perform proxy lookups. This should be preferred over lookups relying on proxy_name. (proxy_name is suitable for printing / logging purposes but not for ID lookups since it has a maximum fixed length)	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	0ddf052972	CLEANUP: server: fix update_status() function comment srv_update_status() function comment says that the function "is designed to be called asynchronously". While this used to be true back then with `64cc49cf` ("MAJOR: servers: propagate server status changes asynchronously.") This is not true anymore since `3ff577e` ("MAJOR: server: make server state changes synchronous again") Fixing the comment in order to better reflect current behavior.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	88687f0980	CLEANUP: errors: fix obsolete function comments Since `9f903af5` ("MEDIUM: log: slightly refine the output format of alerts/warnings/etc"), messages generated by ha_{alert,warning,notice} don't embed date/time information anymore. Updating some old function comments that kept saying otherwise.	2023-04-21 14:36:45 +02:00
Amaury Denoyelle	a65dd3a2c8	BUG/MINOR: quic: consume Rx datagram even on error A BUG_ON crash can occur on qc_rcv_buf() if a Rx packet allocation failed. To fix this, datagrams are marked as consumed even if a fatal error occurred during parsing. For the moment, only a Rx packet allocation failure could provoke this. At this stage, it's unknown whether the datagram was partially parsed or not at all so it's better to discard it completely. This bug was detected using -dMfail argument. This should be backported up to 2.7.	2023-04-20 14:49:32 +02:00
Amaury Denoyelle	d537ca79dc	BUG/MINOR: quic: prevent crash on qc_new_conn() failure Properly initialize el_th_ctx member first on qc_new_conn(). This prevents a segfault if release should be called later due to memory allocation failure in the function on qc_detach_th_ctx_list(). This should be backported up to 2.7.	2023-04-20 14:49:32 +02:00
Amaury Denoyelle	9bbfa72b67	BUG/MINOR: h3: fix crash on h3s alloc failure Do not emit a CONNECTION_CLOSE on h3s allocation failure. Indeed, this causes a crash as the calling function qcs_new() will also try to emit a CONNECTION_CLOSE which triggers a BUG_ON() on qcc_emit_cc(). This was reproduced using -dMfail. This should be backported up to 2.7.	2023-04-20 14:49:32 +02:00
Amaury Denoyelle	93d2ebe9f3	BUG/MINOR: mux-quic: properly handle STREAM frame alloc failure Previously, if a STREAM frame could not be allocated for emission, a crash would occur due to an ABORT_NOW() statement in _qc_send_qcs(). Replace this by proper error code handling. Each stream for which sending fails is temporarily moved from qcc::send_list to a list local to _qc_send_qcs(). Once emission has been conducted for all streams, the failed streams are reinserted into qcc::send_list. This avoids looping again over failed streams in the second while loop at the end of _qc_send_qcs(). This crash was reproduced using -dMfail. This should be backported up to 2.6.	2023-04-20 14:49:32 +02:00
Amaury Denoyelle	ed820823f0	BUG/MINOR: mux-quic: fix crash with app ops install failure On MUX initialization, the application layer is setup via qcc_install_app_ops(). If this function fails MUX is deallocated and an error is returned. This code path causes a crash because the connection has already been registered into the mux_stopping_data::list for stopping idle frontend conns. To fix this, insert the connection later in qc_init() once no error can occur. The crash was seen on the process closing with SIGUSR1 with a segfault on mux_stopping_process(). This was reproduced using -dMfail. This regression was introduced by the following patch : commit `b4d119f0c7` BUG/MEDIUM: mux-quic: fix crash on H3 SETTINGS emission This should be backported up to 2.7.	2023-04-20 14:49:32 +02:00
Frédéric Lécaille	d07421331f	BUG/MINOR: quic: Wrong Retry token generation timestamp computing Again a now_ms variable value used without the ticks API. It is used to store the generation time of the Retry token to be received back from the client. Must be backported to 2.6 and 2.7.	2023-04-19 17:31:28 +02:00
Frédéric Lécaille	45662efb2f	BUG/MINOR: quic: Unchecked buffer length when building the token As server, an Initial does not contain a token but only the token length field with zero as value. The remaining room was not checked before writing this field. Must be backported to 2.6 and 2.7.	2023-04-19 11:36:54 +02:00
Frédéric Lécaille	0ed94032b2	MINOR: quic: Do not allocate too much ack ranges Limit the maximum number of ack ranges to QUIC_MAX_ACK_RANGES(32). Must be backported to 2.6 and 2.7.	2023-04-19 11:36:54 +02:00
Frédéric Lécaille	4b2627beae	BUG/MINOR: quic: Stop removing ACK ranges when building packets Since this commit: BUG/MINOR: quic: Possible wrapped values used as ACK tree purging limit. There are more chances that ack ranges may be removed from their trees when building a packet. It is preferable to impose a limit to these trees. This will be the subject of a commit to come. For now, it is sufficient to stop deleting ack ranges from their trees. Remove quic_ack_frm_reduce_sz() and quic_rm_last_ack_ranges() which were there to do that. Make qc_frm_len() support ACK frames and call it to ensure an ACK frame may be added to a packet before building it. Must be backported to 2.6 and 2.7.	2023-04-19 11:36:54 +02:00
Aurelien DARRAGON	8cd620b46f	MINOR: hlua: safe coroutine.create() Overriding global coroutine.create() function in order to link the newly created subroutine with the parent hlua ctx. (hlua_gethlua() function from a subroutine will return hlua ctx from the hlua ctx on which the coroutine.create() was performed, instead of NULL) Doing so allows hlua_hook() function to support being called from subroutines created using coroutine.create() within user lua scripts. That is: the related subroutine will be immune to the forced-yield, but it will still be checked against hlua timeouts. If the subroutine fails to yield or finish before the timeout, the related lua handler will be aborted (instead of going rogue unnoticed like it would be the case prior to this commit)	2023-04-19 11:03:31 +02:00
Aurelien DARRAGON	cf0f792490	MINOR: hlua: hook yield on known lua state When forcing a yield attempt from hlua_hook(), we should perform it on the known hlua state, not on a potential substate created using coroutine.create() from an existing hlua state from lua script. Indeed, only true hlua coroutines will properly handle the yield and perform the required timeout checks when returning in hlua_ctx_resume(). So far, this was not a concern because hlua_gethlua() would return NULL if hlua_hook() is not directly being called from a hlua coroutine anyway. But with this we're trying to make hlua_hook() ready for being called from a subcoroutine which inherits from a parent hlua ctx. In this case, no yield attempt will be performed, we will simply check for hlua timeouts. Not doing so would result in the timeout checks not being performed since hlua_ctx_resume() is completely bypassed when yielding from the subroutine, resulting in a user-defined coroutine potentially going rogue unnoticed.	2023-04-19 11:03:31 +02:00
Aurelien DARRAGON	2a9764baae	CLEANUP: hlua: avoid confusion between internal timers and tick based timers Not all hlua "time" variables use the same time logic. hlua->wake_time relies on ticks since it's meant to be used in conjunction with task scheduling. Thus, it should be stored as a signed int and manipulated using the tick api. Adding a few comments about that to prevent mixups with hlua internal timer api which doesn't rely on the ticks api.	2023-04-19 11:03:31 +02:00
Aurelien DARRAGON	58e36e5b14	MEDIUM: hlua: introduce tune.lua.burst-timeout The "burst" execution timeout applies to any Lua handler. If the handler fails to finish or yield before timeout is reached, handler will be aborted to prevent thread contention, to prevent traffic from not being served for too long, and ultimately to prevent the process from crashing because of the watchdog kicking in. Default value is 1000ms. Combined with forced-yield default value of 10000 lua instructions, it should be high enough to prevent any existing script breakage, while still being able to catch slow lua converters or sample fetches doing thread contention and risking the process stability. Setting value to 0 completely bypasses this check. (not recommended but could be required to restore original behavior if this feature breaks existing setups somehow...) No backport needed, although it could be used to prevent watchdog crashes due to poorly coded (slow/cpu consuming) lua sample fetches/converters.	2023-04-19 11:03:31 +02:00
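The rule stated in 58e36e5b14 (abort once the burst budget is reached, with 0 meaning "no check") reduces to a small predicate. The sketch below assumes millisecond counters and uses a hypothetical struct and function name; how HAProxy actually measures the burst is not described in the commit message and is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-handler timer for one uninterrupted execution burst. */
struct burst_timer {
	unsigned int start_ms; /* when the current burst started */
	unsigned int limit_ms; /* budget in ms, 0 disables the check */
};

/* Returns 1 when the handler used up its budget and should be aborted.
 * The unsigned subtraction keeps the result correct if the millisecond
 * counter wraps between start and now.
 */
static int burst_expired(const struct burst_timer *t, unsigned int now_ms)
{
	assert(t != NULL);

	if (t->limit_ms == 0)
		return 0;
	return (now_ms - t->start_ms) >= t->limit_ms;
}
```

With the default of 1000ms given in the commit message, a handler started at 100 would still be allowed at 1099 and reported as expired at 1100.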

19910 Commits