summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/cpu
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | x86/cpu/AMD: Remove the pointless detect_ht() callThomas Gleixner2018-06-211-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Real 32bit AMD CPUs do not have SMT and the only value of the call was to reach the magic printout which got removed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org>
| * | | | x86/cpu: Remove the pointless CPU printoutThomas Gleixner2018-06-212-25/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The value of this printout is dubious at best and there is no point in having it in two different places along with convoluted ways to reach it. Remove it completely. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org>
| * | | | x86/bugs: Move the l1tf function and define pr_fmt properlyKonrad Rzeszutek Wilk2018-06-211-26/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pr_warn in l1tf_select_mitigation would have used the prior pr_fmt which was defined as "Spectre V2 : ". Move the function to be past SSBD and also define the pr_fmt. Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf") Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | x86/speculation/l1tf: Add sysfs reporting for l1tfAndi Kleen2018-06-202-0/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | L1TF core kernel workarounds are cheap and normally always enabled, However they still should be reported in sysfs if the system is vulnerable or mitigated. Add the necessary CPU feature/bug bits. - Extend the existing checks for Meltdowns to determine if the system is vulnerable. All CPUs which are not vulnerable to Meltdown are also not vulnerable to L1TF - Check for 32bit non PAE and emit a warning as there is no practical way for mitigation due to the limited physical address bits - If the system has more than MAX_PA/2 physical memory the invert page workarounds don't protect the system against the L1TF attack anymore, because an inverted physical address will also point to valid memory. Print a warning in this case and report that the system is vulnerable. Add a function which returns the PFN limit for the L1TF mitigation, which will be used in follow up patches for sanity and range checks. [ tglx: Renamed the CPU feature bit to L1TF_PTEINV ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Dave Hansen <dave.hansen@intel.com>
* | | | | Merge branch 'x86-timers-for-linus' of ↵Linus Torvalds2018-08-132-25/+28
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 timer updates from Thomas Gleixner: "Early TSC based time stamping to allow better boot time analysis. This comes with a general cleanup of the TSC calibration code which grew warts and duct taping over the years and removes 250 lines of code. Initiated and mostly implemented by Pavel with help from various folks" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) x86/kvmclock: Mark kvm_get_preset_lpj() as __init x86/tsc: Consolidate init code sched/clock: Disable interrupts when calling generic_sched_clock_init() timekeeping: Prevent false warning when persistent clock is not available sched/clock: Close a hole in sched_clock_init() x86/tsc: Make use of tsc_calibrate_cpu_early() x86/tsc: Split native_calibrate_cpu() into early and late parts sched/clock: Use static key for sched_clock_running sched/clock: Enable sched clock early sched/clock: Move sched clock initialization and merge with generic clock x86/tsc: Use TSC as sched clock early x86/tsc: Initialize cyc2ns when tsc frequency is determined x86/tsc: Calibrate tsc only once ARM/time: Remove read_boot_clock64() s390/time: Remove read_boot_clock64() timekeeping: Default boot time offset to local_clock() timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset() s390/time: Add read_persistent_wall_and_boot_offset() x86/xen/time: Output xen sched_clock time from 0 x86/xen/time: Initialize pv xen time in init_hypervisor_platform() ...
| * | | | | x86/CPU: Call detect_nopl() only on the BSPBorislav Petkov2018-07-201-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make it use the setup_* variants and have it be called only on the BSP and drop the call in generic_identify() - X86_FEATURE_NOPL will be replicated to the APs through the forced caps. Helps to keep the mess at a manageable level. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: linux@armlinux.org.uk Cc: schwidefsky@de.ibm.com Cc: heiko.carstens@de.ibm.com Cc: john.stultz@linaro.org Cc: sboyd@codeaurora.org Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: peterz@infradead.org Cc: prarit@redhat.com Cc: feng.tang@intel.com Cc: pmladek@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: linux-s390@vger.kernel.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180719205545.16512-11-pasha.tatashin@oracle.com
| * | | | | x86/jump_label: Initialize static branching earlyPavel Tatashin2018-07-202-23/+28
| | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Static branching is useful to runtime patch branches that are used in hot path, but are infrequently changed. The x86 clock framework is one example that uses static branches to setup the best clock during boot and never changes it again. It is desired to enable the TSC based sched clock early to allow fine grained boot time analysis early on. That requires the static branching functionality to be functional early as well. Static branching requires patching nop instructions, thus, arch_init_ideal_nops() must be called prior to jump_label_init(). Do all the necessary steps to call arch_init_ideal_nops() right after early_cpu_init(), which also allows to insert a call to jump_label_init() right after that. jump_label_init() will be called again from the generic init code, but the code is protected against reinitialization already. [ tglx: Massaged changelog ] Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: linux@armlinux.org.uk Cc: schwidefsky@de.ibm.com Cc: heiko.carstens@de.ibm.com Cc: john.stultz@linaro.org Cc: sboyd@codeaurora.org Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: prarit@redhat.com Cc: feng.tang@intel.com Cc: pmladek@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: linux-s390@vger.kernel.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180719205545.16512-10-pasha.tatashin@oracle.com
* | | | | Merge branch 'x86/pti' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2018-08-132-35/+31
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull x86 PTI updates from Thomas Gleixner: "The Speck brigade sadly provides yet another large set of patches destroying the perfomance which we carefully built and preserved - PTI support for 32bit PAE. The missing counter part to the 64bit PTI code implemented by Joerg. - A set of fixes for the Global Bit mechanics for non PCID CPUs which were setting the Global Bit too widely and therefore possibly exposing interesting memory needlessly. - Protection against userspace-userspace SpectreRSB - Support for the upcoming Enhanced IBRS mode, which is preferred over IBRS. Unfortunately we dont know the performance impact of this, but it's expected to be less horrible than the IBRS hammering. - Cleanups and simplifications" * 'x86/pti' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) x86/mm/pti: Move user W+X check into pti_finalize() x86/relocs: Add __end_rodata_aligned to S_REL x86/mm/pti: Clone kernel-image on PTE level for 32 bit x86/mm/pti: Don't clear permissions in pti_clone_pmd() x86/mm/pti: Fix 32 bit PCID check x86/mm/init: Remove freed kernel image areas from alias mapping x86/mm/init: Add helper for freeing kernel image pages x86/mm/init: Pass unconverted symbol addresses to free_init_pages() mm: Allow non-direct-map arguments to free_reserved_area() x86/mm/pti: Clear Global bit more aggressively x86/speculation: Support Enhanced IBRS on future CPUs x86/speculation: Protect against userspace-userspace spectreRSB x86/kexec: Allocate 8k PGDs for PTI Revert "perf/core: Make sure the ring-buffer is mapped in all page-tables" x86/mm: Remove in_nmi() warning from vmalloc_fault() x86/entry/32: Check for VM86 mode in slow-path check perf/core: Make sure the ring-buffer is mapped in all page-tables x86/pti: Check the return value of pti_user_pagetable_walk_pmd() x86/pti: Check the return value of pti_user_pagetable_walk_p4d() x86/entry/32: Add debug code to check entry/exit CR3 ...
| * \ \ \ \ Merge branch 'x86/pti-urgent' into x86/ptiThomas Gleixner2018-08-061-3/+0
| |\ \ \ \ \ | | | |_|/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | Integrate the PTI Global bit fixes which conflict with the 32bit PTI support. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | x86/speculation: Support Enhanced IBRS on future CPUsSai Praneeth2018-08-032-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Future Intel processors will support "Enhanced IBRS" which is an "always on" mode i.e. IBRS bit in SPEC_CTRL MSR is enabled once and never disabled. From the specification [1]: "With enhanced IBRS, the predicted targets of indirect branches executed cannot be controlled by software that was executed in a less privileged predictor mode or on another logical processor. As a result, software operating on a processor with enhanced IBRS need not use WRMSR to set IA32_SPEC_CTRL.IBRS after every transition to a more privileged predictor mode. Software can isolate predictor modes effectively simply by setting the bit once. Software need not disable enhanced IBRS prior to entering a sleep state such as MWAIT or HLT." If Enhanced IBRS is supported by the processor then use it as the preferred spectre v2 mitigation mechanism instead of Retpoline. Intel's Retpoline white paper [2] states: "Retpoline is known to be an effective branch target injection (Spectre variant 2) mitigation on Intel processors belonging to family 6 (enumerated by the CPUID instruction) that do not have support for enhanced IBRS. On processors that support enhanced IBRS, it should be used for mitigation instead of retpoline." The reason why Enhanced IBRS is the recommended mitigation on processors which support it is that these processors also support CET which provides a defense against ROP attacks. Retpoline is very similar to ROP techniques and might trigger false positives in the CET defense. If Enhanced IBRS is selected as the mitigation technique for spectre v2, the IBRS bit in SPEC_CTRL MSR is set once at boot time and never cleared. Kernel also has to make sure that IBRS bit remains set after VMEXIT because the guest might have cleared the bit. This is already covered by the existing x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() speculation control functions. Enhanced IBRS still requires IBPB for full mitigation. [1] Speculative-Execution-Side-Channel-Mitigations.pdf [2] Retpoline-A-Branch-Target-Injection-Mitigation.pdf Both documents are available at: https://bugzilla.kernel.org/show_bug.cgi?id=199511 Originally-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tim C Chen <tim.c.chen@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/1533148945-24095-1-git-send-email-sai.praneeth.prakhya@intel.com
| * | | | | x86/speculation: Protect against userspace-userspace spectreRSBJiri Kosina2018-07-311-31/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The article "Spectre Returns! Speculation Attacks using the Return Stack Buffer" [1] describes two new (sub-)variants of spectrev2-like attacks, making use solely of the RSB contents even on CPUs that don't fallback to BTB on RSB underflow (Skylake+). Mitigate userspace-userspace attacks by always unconditionally filling RSB on context switch when the generic spectrev2 mitigation has been enabled. [1] https://arxiv.org/pdf/1807.07940.pdf Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1807261308190.997@cbobk.fhfr.pm
| * | | | | x86/entry/32: Enter the kernel via trampoline stackJoerg Roedel2018-07-201-2/+3
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the entry-stack as a trampoline to enter the kernel. The entry-stack is already in the cpu_entry_area and will be mapped to userspace when PTI is enabled. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-8-git-send-email-joro@8bytes.org
* | | | | Merge branch 'x86-cache-for-linus' of ↵Linus Torvalds2018-08-137-72/+2588
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache QoS (RDT/CAR) updates from Thomas Gleixner: "Add support for pseudo-locked cache regions. Cache Allocation Technology (CAT) allows on certain CPUs to isolate a region of cache and 'lock' it. Cache pseudo-locking builds on the fact that a CPU can still read and write data pre-allocated outside its current allocated area on cache hit. With cache pseudo-locking data can be preloaded into a reserved portion of cache that no application can fill, and from that point on will only serve cache hits. The cache pseudo-locked memory is made accessible to user space where an application can map it into its virtual address space and thus have a region of memory with reduced average read latency. The locking is not perfect and gets totally screwed by WBINDV and similar mechanisms, but it provides a reasonable enhancement for certain types of latency sensitive applications. The implementation extends the current CAT mechanism and provides a generally useful exclusive CAT mode on which it builds the extra pseude-locked regions" * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) x86/intel_rdt: Disable PMU access x86/intel_rdt: Fix possible circular lock dependency x86/intel_rdt: Make CPU information accessible for pseudo-locked regions x86/intel_rdt: Support restoration of subset of permissions x86/intel_rdt: Fix cleanup of plr structure on error x86/intel_rdt: Move pseudo_lock_region_clear() x86/intel_rdt: Limit C-states dynamically when pseudo-locking active x86/intel_rdt: Support L3 cache performance event of Broadwell x86/intel_rdt: More precise L2 hit/miss measurements x86/intel_rdt: Create character device exposing pseudo-locked region x86/intel_rdt: Create debugfs files for pseudo-locking testing x86/intel_rdt: Create resctrl debug area x86/intel_rdt: Ensure RDT cleanup on exit x86/intel_rdt: Resctrl files reflect pseudo-locked information x86/intel_rdt: Support creation/removal of pseudo-locked region x86/intel_rdt: Pseudo-lock region creation/removal core x86/intel_rdt: Discover supported platforms via prefetch disable bits x86/intel_rdt: Add utilities to test pseudo-locked region possibility x86/intel_rdt: Split resource group removal in two x86/intel_rdt: Enable entering of pseudo-locksetup mode ...
| * | | | | x86/intel_rdt: Disable PMU accessThomas Gleixner2018-08-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Peter is objecting to the direct PMU access in RDT. Right now the PMU usage is broken anyway as it is not coordinated with perf. Until this discussion settled, disable the PMU mechanics by simply rejecting the type '2' measurement in the resctrl file. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com CC: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: hpa@zytor.com
| * | | | | x86/intel_rdt: Fix possible circular lock dependencyReinette Chatre2018-07-121-15/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lockdep is reporting a possible circular locking dependency: ====================================================== WARNING: possible circular locking dependency detected 4.18.0-rc1-test-test+ #4 Not tainted ------------------------------------------------------ user_example/766 is trying to acquire lock: 0000000073479a0f (rdtgroup_mutex){+.+.}, at: pseudo_lock_dev_mmap but task is already holding lock: 000000001ef7a35b (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x9f/0x which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&mm->mmap_sem){++++}: _copy_to_user+0x1e/0x70 filldir+0x91/0x100 dcache_readdir+0x54/0x160 iterate_dir+0x142/0x190 __x64_sys_getdents+0xb9/0x170 do_syscall_64+0x86/0x200 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #1 (&sb->s_type->i_mutex_key#3){++++}: start_creating+0x60/0x100 debugfs_create_dir+0xc/0xc0 rdtgroup_pseudo_lock_create+0x217/0x4d0 rdtgroup_schemata_write+0x313/0x3d0 kernfs_fop_write+0xf0/0x1a0 __vfs_write+0x36/0x190 vfs_write+0xb7/0x190 ksys_write+0x52/0xc0 do_syscall_64+0x86/0x200 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (rdtgroup_mutex){+.+.}: __mutex_lock+0x80/0x9b0 pseudo_lock_dev_mmap+0x2f/0x170 mmap_region+0x3d6/0x610 do_mmap+0x387/0x580 vm_mmap_pgoff+0xcf/0x110 ksys_mmap_pgoff+0x170/0x1f0 do_syscall_64+0x86/0x200 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: rdtgroup_mutex --> &sb->s_type->i_mutex_key#3 --> &mm->mmap_sem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&mm->mmap_sem); lock(&sb->s_type->i_mutex_key#3); lock(&mm->mmap_sem); lock(rdtgroup_mutex); *** DEADLOCK *** 1 lock held by user_example/766: #0: 000000001ef7a35b (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x9f/0x110 rdtgroup_mutex is already being released temporarily during pseudo-lock region creation to prevent the potential deadlock between rdtgroup_mutex and mm->mmap_sem that is obtained during device_create(). Move the debugfs creation into this area to avoid the same circular dependency. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/fffb57f9c6b8285904c9a60cc91ce21591af17fe.1531332480.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Make CPU information accessible for pseudo-locked regionsReinette Chatre2018-07-032-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a resource group enters pseudo-locksetup mode it reflects that the platform supports cache pseudo-locking and the resource group is unused, ready to be used for a pseudo-locked region. Until it is set up as a pseudo-locked region the resource group is "locked down" such that no new tasks or cpus can be assigned to it. This is accomplished in a user visible way by making the cpus, cpus_list, and tasks resctrl files inaccassible (user cannot read from or write to these files). When the resource group changes to pseudo-locked mode it represents a cache pseudo-locked region. While not appropriate to make any changes to the cpus assigned to this region it is useful to make it easy for the user to see which cpus are associated with the pseudo-locked region. Modify the permissions of the cpus/cpus_list file when the resource group changes to pseudo-locked mode to support reading (not writing). The information presented to the user when reading the file are the cpus associated with the pseudo-locked region. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/12756b7963b6abc1bffe8fb560b87b75da827bd1.1530421961.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Support restoration of subset of permissionsReinette Chatre2018-07-033-10/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the mode of a resource group changes, the operations it can support may also change. One way in which the supported operations are managed is to modify the permissions of the files within the resource group's resctrl directory. At the moment only two possible permissions are supported: the default permissions or no permissions in support for when the operation is "locked down". It is possible where an operation on a resource group may have more possibilities. For example, if by default changes can be made to the resource group by writing to a resctrl file while the current settings can be obtained by reading from the file, then it may be possible that in another mode it is only possible to read the current settings, and not change them. Make it possible to modify some of the permissions of a resctrl file in support of a more flexible way to manage the operations on a resource group. In this preparation work the original behavior is maintained where all permissions are restored. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/8773aadfade7bcb2c48a45fa294a04d2c03bb0a1.1530421961.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Fix cleanup of plr structure on errorReinette Chatre2018-07-031-5/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a resource group enters pseudo-locksetup mode a pseudo_lock_region is associated with it. When the user writes to the resource group's schemata file the CBM of the requested pseudo-locked region is entered into the pseudo_lock_region struct. If any part of pseudo-lock region creation fails the resource group will remain in pseudo-locksetup mode with the pseudo_lock_region associated with it. In case of failure during pseudo-lock region creation care needs to be taken to ensure that the pseudo_lock_region struct associated with the resource group is cleared from any pseudo-locking data - especially the CBM. This is because the existence of a pseudo_lock_region struct with a CBM is significant in other areas of the code, for example, the display of bit_usage and initialization of a new resource group. Fix the error path of pseudo-lock region creation to ensure that the pseudo_lock_region struct is cleared at each error exit. Fixes: 018961ae5579 ("x86/intel_rdt: Pseudo-lock region creation/removal core") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/49b4782f6d204d122cee3499e642b2772a98d2b4.1530421026.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Move pseudo_lock_region_clear()Reinette Chatre2018-07-031-23/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pseudo_lock_region_clear() function is moved to earlier in the file in preparation for its use in functions that currently appear before it. No functional change. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/ef098ec2a45501e23792289bff80ae3152141e2f.1530421026.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Limit C-states dynamically when pseudo-locking activeReinette Chatre2018-06-242-2/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Deeper C-states impact cache content through shrinking of the cache or flushing entire cache to memory before reducing power to the cache. Deeper C-states will thus negatively impact the pseudo-locked regions. To avoid impacting pseudo-locked regions C-states are limited on pseudo-locked region creation so that cores associated with the pseudo-locked region are prevented from entering deeper C-states. This is accomplished by requesting a CPU latency target which will prevent the core from entering C6 across all supported platforms. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/1ef4f99dd6ba12fa6fb44c5a1141e75f952b9cd9.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Support L3 cache performance event of BroadwellReinette Chatre2018-06-242-0/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Broadwell microarchitecture supports pseudo-locking. Add support for the L3 cache related performance events of these systems so that the success of pseudo-locking can be measured more accurately on these platforms. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/36c1414e9bd17c3faf440f32b644b9c879bcbae2.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: More precise L2 hit/miss measurementsReinette Chatre2018-06-242-9/+146
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel Goldmont processors supports non-architectural precise events that can be used to give us more insight into the success of L2 cache pseudo-locking on these platforms. Introduce a new measurement trigger that will enable two precise events, MEM_LOAD_UOPS_RETIRED.L2_HIT and MEM_LOAD_UOPS_RETIRED.L2_MISS, while accessing pseudo-locked data. A new tracepoint, pseudo_lock_l2, is created to make these results visible to the user. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/06b1456da65b543479dac8d9493e41f92f175d6c.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Create character device exposing pseudo-locked regionReinette Chatre2018-06-243-9/+291
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After a pseudo-locked region is created it needs to be made available to user space for usage. A character device supporting mmap() is created for each pseudo-locked region. A user space application can now use mmap() system call to map pseudo-locked region into its virtual address space. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/fccbb9b20f07655ab0a4df9fa1c1babc0288aea0.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Create debugfs files for pseudo-locking testingReinette Chatre2018-06-234-1/+202
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no simple yes/no test to determine if pseudo-locking was successful. In order to test pseudo-locking we expose a debugfs file for each pseudo-locked region that will record the latency of reading the pseudo-locked memory at a stride of 32 bytes (hardcoded). These numbers will give us an idea of locking was successful or not since they will reflect cache hits and cache misses (hardware prefetching is disabled during the test). The new debugfs file "pseudo_lock_measure" will, when the pseudo_lock_mem_latency tracepoint is enabled, record the latency of accessing each cache line twice. Kernel tracepoints offer us histograms (when CONFIG_HIST_TRIGGERS is enabled) that is a simple way to visualize the memory access latency and immediately see any cache misses. For example, the hist trigger below before trigger of the measurement will display the memory access latency and instances at each latency: echo 'hist:keys=latency' > /sys/kernel/debug/tracing/events/resctrl/\ pseudo_lock_mem_latency/trigger echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable echo 1 > /sys/kernel/debug/resctrl/<newlock>/pseudo_lock_measure echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable cat /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/hist Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/6b2ea76181099d1b79ccfa7d3be24497ab2d1a45.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Create resctrl debug areaReinette Chatre2018-06-232-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for support of debugging of RDT sub features the user can now enable a RDT debugfs region. The debug area is always enabled when CONFIG_DEBUG_FS is set as advised in http://lkml.kernel.org/r/20180523080501.GA6822@kroah.com Also from same discussion in above linked email, no error checking on the debugfs creation return value since code should not behave differently when debugging passes or fails. Even on failure the returned value can be passed safely to other debugfs calls. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/9f553faf30866a6317f1aaaa2fe9f92de66a10d2.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Ensure RDT cleanup on exitReinette Chatre2018-06-233-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RDT system's initialization does not have the corresponding exit handling to ensure everything initialized on load is cleaned up also. Introduce the cleanup routines that complement all initialization. This includes the removal of a duplicate rdtgroup_init() declaration. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/a9e3a2bbd731d13915d2d7bf05d4f675b4fa109b.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Resctrl files reflect pseudo-locked informationReinette Chatre2018-06-232-6/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Information about resources as well as resource groups are contained in a variety of resctrl files. Now that pseudo-locked regions can be created the files can be updated to present appropriate information to the user. Update the resource group's schemata file to show only the information of the pseudo-locked region. Update the resource group's size file to show the size in bytes of only the pseudo-locked region. Update the bit_usage file to use the letter 'P' for all pseudo-locked regions. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/5ece82869b651c2178b278e00bca959f7626b6e9.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Support creation/removal of pseudo-locked regionReinette Chatre2018-06-232-5/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The user triggers the creation of a pseudo-locked region when writing a valid schemata to the schemata file of a resource group in the pseudo-locksetup mode. A valid schemata is one that: (1) does not overlap with any other resource group, (2) does not involve a cache that already contains a pseudo-locked region within its hierarchy. After a valid schemata is parsed the system is programmed to associate the to be pseudo-lock bitmask with the closid associated with the resource group. With the system set up the pseudo-locked region can be created. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/8929c3a9e2ba600e79649abe584aa28b8d0ff639.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Pseudo-lock region creation/removal coreReinette Chatre2018-06-232-1/+353
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The user requests a pseudo-locked region by providing a schemata to a resource group that is in the pseudo-locksetup mode. This is the functionality that consumes the parsed user data and creates the pseudo-locked region. First, required information is deduced from user provided data. This includes, how much memory does the requested bitmask represent, which CPU the requested region is associated with, and what is the cache line size of that cache (to learn the stride needed for locking). Second, a contiguous block of memory matching the requested bitmask is allocated. Finally, pseudo-locking is performed. The resource group already has the allocation that reflects the requested bitmask. With this class of service active and interference minimized, the allocated memory is loaded into the cache. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/67391160bbf06143bc62d856d3d234eb152008b7.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Discover supported platforms via prefetch disable bitsReinette Chatre2018-06-231-0/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Knowing the model specific prefetch disable bits is required to support cache pseudo-locking because the hardware prefetchers need to be disabled when the kernel memory is pseudo-locked to cache. We add these bits only for platforms known to support cache pseudo-locking. When the user requests locksetup mode to be entered it will fail if the prefetch disabling bits are not known for the platform. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/3eef559aa9fd693a104ff99ff909cfee450c1695.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Add utilities to test pseudo-locked region possibilityReinette Chatre2018-06-232-0/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A pseudo-locked region does not have a class of service associated with it and thus not tracked in the array of control values maintained as part of the domain. Even so, when the user provides a new bitmask for another resource group it needs to be checked for interference with existing pseudo-locked regions. Additionally only one pseudo-locked region can be created in any cache hierarchy. Introduce two utilities in support of above scenarios: (1) a utility that can be used to test if a given capacity bitmask overlaps with any pseudo-locked regions associated with a particular cache instance, (2) a utility that can be used to test if a pseudo-locked region exists within a particular cache hierarchy. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/b8e31dbdcf22ddf71df46072647b47e7558abb32.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Split resource group removal in twoReinette Chatre2018-06-231-9/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resource groups used for pseudo-locking do not require the same work on removal as the other resource groups. The resource group removal is split in two in preparation for support of pseudo-locking resource groups. A single re-ordering occurs - the setting of the rdtgrp flag is moved to later. This flag is not used by any of the code between its original and new location. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/c8cbf7a7c72480b39bb946a929dbae96c0f9aca1.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Enable entering of pseudo-locksetup modeReinette Chatre2018-06-232-10/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The user can request entering pseudo-locksetup mode by writing "pseudo-locksetup" to the mode file. Act on this request as well as support switching from a pseudo-locksetup mode (before pseudo-locked mode was entered). It is not supported to modify the mode once pseudo-locked mode has been entered. The schemata reflects the new mode by adding "uninitialized" to all resources. The size resctrl file reports zero for all cache domains in support of the uninitialized nature. Since there are no users of this class of service its allocations can be ignored when searching for appropriate default allocations for new resource groups. For the same reason resource groups in pseudo-locksetup mode are not considered when testing if new resource groups may overlap. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/56f553334708022903c296284e62db3bbc1ff150.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Support enter/exit of locksetup modeReinette Chatre2018-06-232-6/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The locksetup mode is the way in which the user communicates that the resource group will be used for a pseudo-locked region. Locksetup mode should thus ensure that all restrictions on a resource group are met before locksetup mode can be entered. The resource group should also be configured to ensure that it cannot be modified in unsupported ways when a pseudo-locked region. Introduce the support where the request for entering locksetup mode can be validated. This includes: CDP is not active, no cpus or tasks are assigned to the resource group, monitoring is not in progress on the resource group. Once the resource group is determined ready for a pseudo-locked region it is configured to not allow future changes to these properties. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/b120f71ced30116bcc6c6f651e8a7906ae6b903d.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Introduce pseudo-locked regionReinette Chatre2018-06-231-23/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A pseudo-locked region is introduced representing an instance of a pseudo-locked cache region. Each cache instance (domain) can support one pseudo-locked region. Similarly a resource group can be used for one pseudo-locked region. Include a pointer to a pseudo-locked region from the domain and resource group structures. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/9f69eb159051067703bcbc714de62e69874d5dee.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Add check to determine if monitoring in progressReinette Chatre2018-06-231-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a resource group is pseudo-locked it is orphaned without a class of service associated with it. We thus do not want any monitoring in progress on a resource group that will be used for pseudo-locking. Introduce a test that can be used to determine if pseudo-locking in progress on a resource group. Temporarily mark it as unused to avoid compile warnings until it is used. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/14fd9494f87ca72a213b3a197d1172d4e66ae196.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Utilities to restrict/restore access to specific filesReinette Chatre2018-06-232-1/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In support of Cache Pseudo-Locking we need to restrict access to specific resctrl files to protect the state of a resource group used for pseudo-locking from being changed in unsupported ways. Introduce two utilities that can be used to either restrict or restore the access to all files irrelevant to cache pseudo-locking when pseudo-locking in progress for the resource group. At this time introduce a new source file, intel_rdt_pseudo_lock.c, that will contain most of the code related to cache pseudo-locking. Temporarily mark these new functions as unused to silence compile warnings until they are used. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/ab6319d1244366be3f9b7f9fba1c3da4810a274b.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Protect against resource group changes during lockingReinette Chatre2018-06-232-4/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We intend to modify file permissions to make the "tasks", "cpus", and "cpus_list" not accessible to the user when cache pseudo-locking in progress. Even so, it is still possible for the user to force the file permissions (using chmod) to make them writeable. Similarly, directory permissions will be modified to prevent future monitor group creation but the user can override these restrictions also. Add additional checks to the files we intend to restrict to ensure that no modifications from user space are attempted while setting up a pseudo-locking or after a pseudo-locked region is set up. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/0c5cb006e81ead0b8bfff2df530c5d3017fd31d1.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Add utility to restrict/restore access to resctrl filesReinette Chatre2018-06-232-0/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a resource group is used for Cache Pseudo-Locking then the region of cache ends up being orphaned with no class of service referring to it. The resctrl files intended to manage how the classes of services are utilized thus become irrelevant. The fact that a resctrl file is not relevant can be communicated to the user by setting all of its permissions to zero. That is, its read, write, and execute permissions are unset for all users. Introduce two utilities, rdtgroup_kn_mode_restrict() and rdtgroup_kn_mode_restore(), that can be used to restrict and restore the permissions of a file or directory belonging to a resource group. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/7afdbf5551b2f93cd45d61fbf5e01d87331f529a.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Add utility to test if tasks assigned to resource groupReinette Chatre2018-06-232-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In considering changes to a resource group it becomes necessary to know whether tasks have been assigned to the resource group in question. Introduce a new utility that can be used to check if any tasks have been assigned to a particular resource group. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/be9ea3969ffd731dfd90c0ebcd5a0e0a2d135bb2.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Respect read and write accessReinette Chatre2018-06-231-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By default, if the opener has CAP_DAC_OVERRIDE, a kernfs file can be opened regardless of RW permissions. Writing to a kernfs file will thus succeed even if permissions are 0000. It's required to restrict the actions that can be performed on a resource group from userspace based on the mode of the resource group. This restriction will be done through a modification of the file permissions. That is, for example, if a resource group is locked then the user cannot add tasks to the resource group. For this restriction through file permissions to work it has to be ensured that the permissions are always respected. To do so the resctrl filesystem is created with the KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK flag that will result in open(2) failing with -EACCESS regardless of CAP_DAC_OVERRIDE if the permission does not have the respective read or write access. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/26f4fc25f110bfc07c2d2c8b2c4ee904922fedf7.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Introduce the Cache Pseudo-Locking modesReinette Chatre2018-06-232-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The two modes used to manage Cache Pseudo-Locked regions are introduced. A resource group is assigned "pseudo-locksetup" mode when the user indicates that this resource group will be used for a Cache Pseudo-Locked region. When the Cache Pseudo-Locked region has been set up successfully after the user wrote the requested schemata to the "schemata" file, then the mode will automatically changed to "pseudo-locked". The user is not able to modify the mode to "pseudo-locked" by writing "pseudo-locked" to the "mode" file directly. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/98d6ca129bbe7dd0932d1fcfeb3cbb65f29a8d9d.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Display resource groups' allocations' size in bytesReinette Chatre2018-06-232-0/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The schemata file displays the allocations associated with each domain of each resource. The syntax of this file reflects the capacity bitmask (CBM) of the actual allocation. In order to determine the actual size of an allocation the user needs to dig through three different files to query the variables needed to compute it (the cache size, the CBM length, and the schemata). Introduce a new file "size" associated with each resource group that will mirror the schemata file syntax and display the size in bytes of each allocation. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/cc0058014c30adb88ca7d1a5abfadacbfb5edd0d.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Introduce "bit_usage" to display cache allocations detailsReinette Chatre2018-06-231-0/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With cache regions now explicitly marked as "shareable" or "exclusive" we would like to communicate to the user how portions of the cache are used. Introduce "bit_usage" that indicates for each resource how portions of the cache are configured to be used. To assist the user to distinguish whether the sharing is from software or hardware we add the following annotation: 0 - currently unused X - currently available for sharing and used by software and hardware H - currently used by hardware only but available for software use S - currently used and shareable by software only E - currently used exclusively by one resource group Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/105d44c40e582c2b7e2dccf0ae247e5e61137d4b.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Ensure requested schemata respects modeReinette Chatre2018-06-233-14/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the administrator requests a change in a resource group's schemata we have to ensure that the new schemata respects the current resource group as well as the other active resource groups' schemata. The new schemata is not allowed to overlap with the schemata of any exclusive resource groups. Similarly, if the resource group being changed is exclusive then its new schemata is not allowed to overlap with any schemata of any other active resource group. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/b0c05b21110d3040fff45f4c1d2cfda8dba3f207.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Support flexible data to parsing callbacksReinette Chatre2018-06-232-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each resource is associated with a configurable callback that should be used to parse the information provided for the particular resource from user space. In addition to the resource and domain pointers this callback is provided with just the character buffer being parsed. In support of flexible parsing the callback is modified to support a void pointer as argument. This enables resources that need more data than just the user provided data to pass its required data to the callback without affecting the signatures for the callbacks of all the other resources. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/34baacfced4d787d994ec7015e249e6c7e619053.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Making CBM name and type more explicitReinette Chatre2018-06-231-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cbm_validate() receives a pointer to the variable that will be initialized with a validated capacity bitmask. The pointer points to a variable of type unsigned long that is immediately assigned to a variable of type u32 by the caller on return from cbm_validate(). Let cbm_validate() initialize a variable of type u32 directly. At this time also change tha variable name "data" within parse_cbm() to a name more reflective of the content: "cbm_val". This frees up the generic "data" to be used later when it is indeed used for a collection of input. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/5e29cf0209ea2deac9beacd35cbe5239a50959fb.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Enable setting of exclusive modeReinette Chatre2018-06-231-1/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new "mode" file now accepts "exclusive" that means that the allocations of this resource group cannot be shared. Enable users to modify a resource group's mode to "exclusive". To succeed it is required that there is no overlap between resource group's current schemata and that of all the other active resource groups as well as cache regions potentially used by other hardware entities. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/83642cbba3c8c21db7fa6bb36fe7d385d3b275f2.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Introduce new "exclusive" modeReinette Chatre2018-06-232-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment all allocations are shareable. There is no way for a user to designate that an allocation associated with a resource group cannot be shared by another. Introduce the new mode "exclusive". When a resource group is marked as such it implies that no overlap is allowed between its allocation and that of another resource group. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/f6d24672a4280fe3b24cd2da9b5f50214439c1af.1529706536.git.reinette.chatre@intel.com
| * | | | | x86/intel_rdt: Initialize new resource group with sane defaultsReinette Chatre2018-06-231-3/+112
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently when a new resource group is created its allocations would be those that belonged to the resource group to which its closid belonged previously. That is, we can encounter a case like: mkdir newgroup cat newgroup/schemata L2:0=ff;1=ff echo 'L2:0=0xf0;1=0xf0' > newgroup/schemata cat newgroup/schemata L2:0=0xf0;1=0xf0 rmdir newgroup mkdir newnewgroup cat newnewgroup/schemata L2:0=0xf0;1=0xf0 When the new group is created it would be reasonable to expect its allocations to be initialized with all regions that it can possibly use. At this time these regions would be all that are shareable by other resource groups as well as regions that are not currently used. If the available cache region is found to be non-contiguous the available region is adjusted to enforce validity. When a new resource group is created the hardware is initialized with these new default allocations. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/c468ed79340b63024111978e01430bb9589d85c0.1529706536.git.reinette.chatre@intel.com
OpenPOWER on IntegriCloud