This patch adds very basic support for getting information about
devlink devices, which are typically PCI devices that expose networking
switch or legacy devices.
This information includes bus name, device name and eswitch modes.
This is done through the devlink family of commands via generic netlink
sockets provided by the Linux kernel.
DevlinkDevice represents a devlink device which is identified by bus
name and device name (unlike interface index for netdevices).
It contains the DevlinkDevAttrs device attributes.
Currently only eswitch attributes are queried. In the future, more
attributes such as port, shared buffer and traffic class will be added.
Signed-off-by: Parav Pandit <parav@mellanox.com>
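As an illustration of the devlink description above, here is a minimal sketch of how a device keyed by bus name and device name might be modelled. The field names, the `Handle` helper and the sample values are assumptions made for this example and are not taken from the patch.

```go
package main

import "fmt"

// DevlinkDevEswitchAttr holds eswitch modes as strings.
// The exact set of fields is an assumption for this sketch.
type DevlinkDevEswitchAttr struct {
	Mode       string
	InlineMode string
	EncapMode  string
}

// DevlinkDevAttrs groups the device attributes; only eswitch for now.
type DevlinkDevAttrs struct {
	Eswitch DevlinkDevEswitchAttr
}

// DevlinkDevice is identified by bus name and device name,
// not by an interface index as netdevices are.
type DevlinkDevice struct {
	BusName    string
	DeviceName string
	Attrs      DevlinkDevAttrs
}

// Handle is a hypothetical helper that joins both names into the
// "bus/device" form used on the devlink command line.
func (d DevlinkDevice) Handle() string {
	return d.BusName + "/" + d.DeviceName
}

func main() {
	dev := DevlinkDevice{
		BusName:    "pci",
		DeviceName: "0000:03:00.0", // sample PCI address
		Attrs: DevlinkDevAttrs{
			Eswitch: DevlinkDevEswitchAttr{Mode: "switchdev"},
		},
	}
	fmt.Println(dev.Handle(), dev.Attrs.Eswitch.Mode)
}
```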
Testing and functionality for the use of HFSC have been implemented.
Service curves are implemented to closely follow how they behave
in the TC implementation.
Automated checks and testing were successful.
Closes #354
The previous attempt to fix #354 was only hiding the true issue: a buffer
too small to pick up the message from the kernel.
According to https://github.com/vishvananda/netlink/issues/354#issuecomment-401559441
such situation could occur not only during dump of VF list, but also
* statistics
* tc rules and tc filters
* large conn track dump
* rdma resource details dump for debugging
or any other place where the kernel can return more data than the
default-sized (4kB) buffer can hold.
iproute2 in this case has a 16kB buffer for rtnl_dump_filter_l,
but we don't distinguish between different receiving funcs,
so I'm proposing to stick with the finder of the original issue's cause
(kudos to Parav Pandit aka paravmellanox), who is proposing 64kB as the
buffer size.
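To make the buffer-size problem above concrete, the following sketch simulates receiving one large netlink reply into a 4kB and a 64kB buffer and checks whether the message length in the header still fits. It does not open a real socket; `fakeReply`, `receive` and `countMessages` are hypothetical helpers, and little-endian byte order is assumed for the header.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	nlmsgHdrLen     = 16    // size of struct nlmsghdr
	defaultBufSize  = 4096  // the old, too small buffer
	proposedBufSize = 65536 // the 64kB size proposed above
)

var errBadMessage = errors.New("truncated or malformed netlink message")

// countMessages walks buf and counts the complete netlink messages in it.
func countMessages(buf []byte) (int, error) {
	n := 0
	for len(buf) > 0 {
		if len(buf) < nlmsgHdrLen {
			return n, errBadMessage
		}
		// The first 4 bytes hold nlmsg_len (little-endian assumed).
		msgLen := int(binary.LittleEndian.Uint32(buf[0:4]))
		if msgLen < nlmsgHdrLen || msgLen > len(buf) {
			return n, errBadMessage
		}
		// Messages are padded to a 4-byte boundary.
		next := (msgLen + 3) &^ 3
		if next > len(buf) {
			next = len(buf)
		}
		buf = buf[next:]
		n++
	}
	return n, nil
}

// fakeReply builds one message with the given payload size.
// All header fields except the length are left at zero.
func fakeReply(payload int) []byte {
	msg := make([]byte, nlmsgHdrLen+payload)
	binary.LittleEndian.PutUint32(msg[0:4], uint32(len(msg)))
	return msg
}

// receive mimics a recv call: at most bufSize bytes are kept.
func receive(datagram []byte, bufSize int) []byte {
	buf := make([]byte, bufSize)
	return buf[:copy(buf, datagram)]
}

func main() {
	reply := fakeReply(10000) // stand-in for a large dump reply
	for _, size := range []int{defaultBufSize, proposedBufSize} {
		n, err := countMessages(receive(reply, size))
		fmt.Printf("buffer=%d messages=%d err=%v\n", size, n, err)
	}
}
```

With the smaller buffer the length in the header exceeds the bytes actually received, so the parser has to give up; the larger buffer holds the whole message.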
This patch adds very basic support for getting information about an RDMA
networking device, starting with device index, name, firmware version,
node GUID and system image GUID.
This is done through the RDMA netlink socket.
RDMA devices are somewhat similar to Ethernet devices.
However, there are a few major differences between them.
RDMA devices usually have one or two ports, unlike Ethernet devices.
Each port has its own attributes, state and network addresses which are
different than Ethernet devices (Link and LinkAttrs). They almost don't
overlap with Link and LinkAttrs.
Therefore RDMA devices don't derive from the Link and LinkAttrs
structures; instead they are represented using RdmaLink and RdmaLinkAttrs.
RdmaLink represents an RDMA device containing its attributes.
All RDMA device communication occurs through the rdma subsystem's netlink
socket.
Signed-off-by: Parav Pandit <parav@mellanox.com>
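The RDMA attributes listed above could be held in a pair of structs along the following lines. This is only a sketch: the field names are guesses based on the attribute list, and `guidString`, which prints a 64-bit GUID as eight colon-separated hex bytes with the most significant byte first, is a hypothetical helper whose output format and byte order are assumptions.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strings"
)

// RdmaLinkAttrs mirrors the attributes named in the commit message.
// Field names are illustrative, not taken from the patch.
type RdmaLinkAttrs struct {
	Index           uint32
	Name            string
	FirmwareVersion string
	NodeGuid        string
	SysImageGuid    string
}

// RdmaLink represents an RDMA device together with its attributes.
type RdmaLink struct {
	Attrs RdmaLinkAttrs
}

// guidString formats a 64-bit GUID as colon-separated hex bytes,
// most significant byte first (an assumed presentation).
func guidString(guid uint64) string {
	var b [8]byte
	binary.BigEndian.PutUint64(b[:], guid)
	parts := make([]string, len(b))
	for i, v := range b {
		parts[i] = fmt.Sprintf("%02x", v)
	}
	return strings.Join(parts, ":")
}

func main() {
	link := RdmaLink{Attrs: RdmaLinkAttrs{
		Index:           0,
		Name:            "mlx5_0",     // sample device name
		FirmwareVersion: "14.23.1020", // sample value
		NodeGuid:        guidString(0x248a070300b3d112),
		SysImageGuid:    guidString(0x248a070300b3d112),
	}}
	fmt.Printf("%d: %s fw %s node_guid %s\n",
		link.Attrs.Index, link.Attrs.Name,
		link.Attrs.FirmwareVersion, link.Attrs.NodeGuid)
}
```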
This patch adds ClassStatistics, a struct that represents the stats
of a class based on generic networking stats for netlink, to ClassAttrs.
Parsers for rtattrs of type TCA_STATS and TCA_STATS2 are
introduced as well, and the stats are parsed as part
of the ClassAttrs struct.
Practical tests for the stats are not included in this patch yet, since
they require actual packet sending/receiving at random times,
which makes the tests complicated and flaky. Once we figure out how
to test them properly, they will be added.
Signed-off-by: Taku Fukushima <taku@soracom.jp>
Add support for setting the InfiniBand Node and Port GUID address
configuration of a VF when InfiniBand HCAs are used in SR-IOV mode.
Signed-off-by: Parav Pandit <parav@mellanox.com>
The IFLA_* constants in x/sys/unix were updated to Linux 4.15 in
golang/sys@88d2dcc510, so use these instead of locally duplicating
them.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
This test spawns a goroutine that subscribes to some
events while the main thread closes the socket.
The goroutine returns after 5s, when the timeout
on the recv fires and the fd is actually == -1.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
If the socket is closed, recv calls that are waiting for messages
are not woken up. The result, especially for Subscribe sockets, is
most likely a goroutine leak.
This commit introduces a method to set the timeout.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Add support for setting the trust state of a VF. This allows restricting
certain operations on a VF when it is untrusted, such as disabling
promiscuous mode.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Add bond parameters corresponding to:
* IFLA_BOND_AD_ACTOR_SYS_PRIO
* IFLA_BOND_AD_USER_PORT_KEY
* IFLA_BOND_AD_ACTOR_SYSTEM
* IFLA_BOND_TLB_DYNAMIC_LB
These are available in new(ish) kernels.
* Multicast snooping and hello time are the only ones supported at the
moment
* Only pass values to the kernel when the user sets them, otherwise let
the kernel decide the default
* Can set multicast snooping on existing bridges
* Tests disabled on Travis CI as the kernel version is too old
* All bridge flags copied from kernel code, but only the two mentioned
above work
(5a7ad1146c/include/uapi/linux/if_link.h (L232-L281))
Signed-off-by: Petar Petrov <pppepito86@gmail.com>
Signed-off-by: Ed King <eking@pivotal.io>
Signed-off-by: Konstantinos Karampogias <konstantinos.karampogias@swisscom.com>
Signed-off-by: Will Martin <wmartin@pivotal.io>
Bridge ports can be set to use the proxy arp features by calling
either LinkSetBrProxyArp() or LinkSetBrProxyArpWiFi().
Signed-off-by: David Wilder <wilder@us.ibm.com>
This adds parsing of the preferred and valid lifetime information from the
netlink IFA_CACHEINFO attribute. They are stored as PreferedLft and ValidLft in
the Addr struct if found.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
- Conntrack table FLUSH
- Conntrack table DELETE with filter
  The filter is only for the IP field
- Conntrack table GET
  The flow information is not complete, but the method
  returns a simplified structure with basic flow info
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
$ ip -M route add 100 dev eth0
$ ip -M route add 100 as to 200/300 dev eth0
$ ip -M route add 100 nexthop dev eth0 as to 200 \
    nexthop dev eth1 as to 300
$ ip route add 10.10.0.0/24 encap mpls 200/300 dev eth0
$ ip route add 10.0.0.0/24 nexthop encap mpls 200 dev eth0 \
    nexthop encap mpls 300 dev eth1
Signed-off-by: ISHIDA Wataru <ishida.wataru@lab.ntt.co.jp>
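The label stacks in the commands above, such as 200/300, end up as a sequence of 32-bit MPLS label stack entries. The sketch below shows one way to parse that notation and encode it in the standard entry layout (20-bit label, 3-bit traffic class, bottom-of-stack bit, 8-bit TTL, big-endian). `parseLabels` and `encodeMPLSStack` are hypothetical helpers, and traffic class and TTL are simply left at zero.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strconv"
	"strings"
)

// parseLabels turns "200/300" into [200 300].
// Labels are limited to 20 bits.
func parseLabels(s string) ([]uint32, error) {
	parts := strings.Split(s, "/")
	labels := make([]uint32, 0, len(parts))
	for _, p := range parts {
		v, err := strconv.ParseUint(p, 10, 20)
		if err != nil {
			return nil, fmt.Errorf("bad label %q: %w", p, err)
		}
		labels = append(labels, uint32(v))
	}
	return labels, nil
}

// encodeMPLSStack packs labels into 4-byte stack entries.
// Only the last entry has the bottom-of-stack bit set.
func encodeMPLSStack(labels []uint32) []byte {
	out := make([]byte, 4*len(labels))
	for i, l := range labels {
		entry := l << 12 // label occupies the top 20 bits
		if i == len(labels)-1 {
			entry |= 1 << 8 // bottom-of-stack bit
		}
		binary.BigEndian.PutUint32(out[4*i:], entry)
	}
	return out
}

func main() {
	labels, err := parseLabels("200/300")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("% x\n", encodeMPLSStack(labels))
}
```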
The go get command and make both fail when executed on
non-Linux platforms. Modified the code so that there are no
compilation errors when developing in such an
environment.